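Clinical Utility of a CT-based AI Prognostic Model for Segmentectomy in Non–Small Cell Lung Cancer | Literature Express: Deep Learning-Based Diagnostic Systems for Breast and Prostate Diseases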

Published: 2024-08-16

Title

Clinical Utility of a CT-based AI Prognostic Model for Segmentectomy in Non–Small Cell Lung Cancer

Background

Currently, no tool exists for risk stratification in patients undergoing segmentectomy for non–small cell lung cancer (NSCLC).

Method

In this single-center retrospective study, transfer learning of a pretrained model was performed for survival prediction in patients with clinical stage IA NSCLC who underwent lobectomy from January 2008 to March 2017. The internal set was divided into training, validation, and testing sets based on the assignments from the pretraining set. The model was tested on an independent test set of patients with clinical stage IA NSCLC who underwent segmentectomy from January 2010 to December 2017. Its prognostic performance was analyzed using the time-dependent area under the receiver operating characteristic curve (AUC), sensitivity, and specificity for freedom from recurrence (FFR) at 2 and 4 years and lung cancer–specific survival and overall survival at 4 and 6 years. The model sensitivity and specificity were compared with those of the Japan Clinical Oncology Group (JCOG) eligibility criteria for sublobar resection.

Conclusion

The CT-based DL model identified patients at high risk among those with clinical stage IA NSCLC who underwent segmentectomy, outperforming the JCOG criteria.

Results

The pretraining set included 1756 patients. Transfer learning was performed in an internal set of 730 patients (median age, 63 years [IQR, 56–70 years]; 366 male), and the segmentectomy test set included 222 patients (median age, 65 years [IQR, 58–71 years]; 114 male). The model performance for 2-year FFR was as follows: AUC, 0.86 (95% CI: 0.76, 0.96); sensitivity, 87.4% (7.17 of 8.21 patients; 95% CI: 59.4, 100); and specificity, 66.7% (136 of 204 patients; 95% CI: 60.2, 72.8). The model showed higher sensitivity for FFR than the JCOG criteria (87.4% vs 37.6% [3.08 of 8.21 patients], P = .02), with similar specificity.
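The reported percentages can be roughly checked from the counts quoted above. The sketch below only recomputes the ratios. The fractional patient counts (for example 7.17 of 8.21) are copied as given; they presumably come from weighting in the time-dependent analysis, which is an assumption here and is not reproduced. Because the quoted counts are already rounded, the recomputed values may differ from the published ones in the last digit.

```python
def percentage(numerator: float, denominator: float) -> float:
    """Return numerator / denominator expressed as a percentage."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return 100.0 * numerator / denominator


# Counts quoted in the Results paragraph for 2-year freedom from recurrence.
model_sensitivity = percentage(7.17, 8.21)   # reported as 87.4%
model_specificity = percentage(136, 204)     # reported as 66.7%
jcog_sensitivity = percentage(3.08, 8.21)    # reported as 37.6%

print(f"model sensitivity: {model_sensitivity:.2f}%")
print(f"model specificity: {model_specificity:.2f}%")
print(f"JCOG sensitivity:  {jcog_sensitivity:.2f}%")
```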

Figure

Figure 1: Schematic shows the overall study design. The pretraining set included patients with non–small cell lung cancer with any tumor size and lymph node involvement but without metastasis, confirmed at pathologic examination (pTanyNanyM0). LN = lymph node, LVI = lymphovascular invasion, 3D = three-dimensional, VPI = visceral pleural invasion.

Figure 2: Flowcharts show patient inclusion and exclusion for the (A) pretraining and internal set and (B) independent segmentectomy test set. The pretraining set (A) (including patients with any tumor size and lymph node involvement, but without metastasis, confirmed at pathologic examination [pTanyNanyM0]) was split randomly for training, validation, and testing at a ratio of 6:2:2. Transfer learning was applied to a subset of the pretraining set, specifically in patients with clinical stage IA non–small cell lung cancer (ie, the internal set). The internal set was divided based on the assignments from the pretraining set, such that the ratio of the internal training set (ie, for transfer learning) to the internal validation set to the internal testing set was 6:1.8:2.2. FEV1 = forced expiratory volume in first second of exhalation.
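The splitting scheme in this caption can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `assign_splits`, the seeds, and the way the internal subset is chosen are all hypothetical. In the study the internal set consists of the clinical stage IA patients; here 730 IDs are simply drawn at random to show that a subset inheriting its assignments from the parent split will generally not reproduce the 6:2:2 ratio exactly.

```python
import random
from collections import Counter


def assign_splits(patient_ids, ratios=(0.6, 0.2, 0.2), seed=0):
    """Randomly assign each patient ID to train, validation, or test."""
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)
    n_train = round(ratios[0] * len(ids))
    n_val = round(ratios[1] * len(ids))
    labels = ["train"] * n_train + ["validation"] * n_val
    labels += ["test"] * (len(ids) - len(labels))  # remainder goes to test
    return dict(zip(ids, labels))


# Pretraining set: 1756 patients split at roughly 6:2:2.
assignment = assign_splits(range(1756))
counts = Counter(assignment.values())

# Hypothetical stand-in for the internal (clinical stage IA) subset, n = 730.
internal_ids = random.Random(1).sample(range(1756), 730)

# Each internal patient keeps the split it received in the pretraining set,
# so no patient moves between training and testing during transfer learning.
internal_assignment = {pid: assignment[pid] for pid in internal_ids}
internal_counts = Counter(internal_assignment.values())

print(counts)
print(internal_counts)
```

Reusing the parent assignments rather than re-splitting keeps test patients unseen at both training stages, which is consistent with the slightly uneven 6:1.8:2.2 ratio reported for the internal set.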

Figure 3: Kaplan-Meier survival curves stratified according to the dichotomized deep learning (DL)–driven risk scores show (A) overall survival (OS) in the internal test set using the DL-driven 4-year risk scores and (B) freedom from recurrence, (C) lung cancer–specific survival, and (D) OS in the segmentectomy test set using the DL-driven 2-year, 4-year, and 6-year risk scores. The cutoffs were determined empirically as the median values in the internal validation set, which were 1.36% for the DL-driven 2-year risk score and 4.36% for the 4-year risk score. The cutoffs remained unchanged regardless of the study outcome.
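For readers unfamiliar with the method, a Kaplan-Meier curve per risk group can be computed as in the sketch below. It is a plain-Python illustration of the product-limit estimator; the follow-up times and event indicators are made-up toy values, not data from the study.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times  -- follow-up time for each patient (any consistent unit)
    events -- 1 if the event was observed at that time, 0 if censored
    """
    pairs = sorted(zip(times, events))
    n_at_risk = len(pairs)
    survival = 1.0
    curve = []
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        n_events = 0
        n_leaving = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(pairs) and pairs[i][0] == t:
            n_events += pairs[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= n_leaving
    return curve


# Toy data in months (hypothetical), one list per dichotomized risk group.
high_risk_curve = kaplan_meier([6, 12, 12, 24, 36, 48], [1, 0, 1, 1, 0, 0])
low_risk_curve = kaplan_meier([30, 40, 50, 60], [0, 1, 0, 0])

print(high_risk_curve)
print(low_risk_curve)
```

Censored patients lower the number at risk without lowering the survival estimate, which is why the curve only steps down at observed event times.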

Figure 4: Kaplan-Meier survival curves according to the dichotomized deep learning (DL)–driven risk scores in segmentectomy subgroups of patients who met clinical trial eligibility. (A–C) Graphs show freedom from recurrence (FFR) (A), lung cancer–specific survival (LCSS) (B), and overall survival (OS) (C) in patients eligible for the Cancer and Leukemia Group B 140503 trial. (D–F) Graphs show FFR (D), LCSS (E), and OS (F) in patients eligible for the Japan Clinical Oncology Group (JCOG) trials (JCOG0802, JCOG1211, and JCOG0804). The DL-driven 2-year risk score was used for FFR, and the 4-year risk score was used for LCSS and OS. The cutoffs were determined empirically as the median values in the internal validation set, which were 1.36% for the DL-driven 2-year risk score and 4.36% for the 4-year risk score. The cutoffs were not altered according to the study outcomes. In the segmentectomy test set, the same patients were consistently classified into the DL-based low-risk group across different time points (2, 4, and 6 years).

Figure 5: Representative CT images with heat map visualization. From left to right: Axial nonenhanced CT images show a preoperative scan with overlaid gradient-weighted activation maps for visceral pleural invasion, lymphovascular invasion, lymph node, and survival prediction, respectively. (A) Images in an 83-year-old male patient with clinical stage IA3 adenocarcinoma. The deep learning (DL)–driven 2-year risk score was 3.65% and the 4-year risk score was 10.2%. The tumor recurred 36.5 months after surgery. (B) Images in a 79-year-old female patient with clinical stage IA2 adenocarcinoma. The DL-driven 2-year risk score was 8.60% and the 4-year risk score was 20.2%. Tumor recurrence was observed 25.9 months after surgery. (C) Images in a 71-year-old male patient with clinical stage IA2 adenocarcinoma. The DL-driven 2-year risk score was 0.16% and the 4-year risk score was 0.74%. There was no evidence of disease recurrence until 60.8 months of postoperative follow-up. (D) Images in a 76-year-old female patient with clinical stage IA3 adenocarcinoma. The DL-driven 2-year risk score was 0.63% and the 4-year risk score was 2.28%. No recurrence was noted at a 22-month follow-up visit. The DL model predicted the cumulative overall survival probability in patients with clinical stage IA lung cancer, and the prediction was enhanced by the multitask learning of CT features for visceral pleural invasion, lymphovascular invasion, and lymph node metastasis. The color bar transitions from dark blue to dark red, indicating pixel activation ranging from a low to high degree on the heat maps.
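As a worked example, the four patients in this figure can be dichotomized with the cutoffs given in the Figure 3 and Figure 4 captions (1.36% for the 2-year score, 4.36% for the 4-year score). The scores and outcomes below are taken from the caption. The rule that a score strictly above the cutoff counts as high risk is an assumption, since the source does not say how ties are handled.

```python
CUTOFF_2Y = 1.36  # percent, 2-year risk score cutoff from the captions
CUTOFF_4Y = 4.36  # percent, 4-year risk score cutoff from the captions

# Risk scores (percent) and recurrence status as described in Figure 5.
patients = {
    "A": {"risk_2y": 3.65, "risk_4y": 10.2, "recurred": True},
    "B": {"risk_2y": 8.60, "risk_4y": 20.2, "recurred": True},
    "C": {"risk_2y": 0.16, "risk_4y": 0.74, "recurred": False},
    "D": {"risk_2y": 0.63, "risk_4y": 2.28, "recurred": False},
}


def risk_group(score: float, cutoff: float) -> str:
    """Assumed rule: strictly above the cutoff is high risk."""
    return "high" if score > cutoff else "low"


groups_2y = {pid: risk_group(p["risk_2y"], CUTOFF_2Y) for pid, p in patients.items()}
groups_4y = {pid: risk_group(p["risk_4y"], CUTOFF_4Y) for pid, p in patients.items()}

for pid, p in patients.items():
    print(pid, groups_2y[pid], groups_4y[pid], "recurred" if p["recurred"] else "no recurrence")
```

In these four cases the two patients classified as high risk are the ones whose tumors recurred, and each patient falls in the same group under both cutoffs.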

Table

Table 1: Patient and Tumor Characteristics

Table 2: Prognostication Using the DL-driven Risk Scores in the Internal and Independent Segmentectomy Test Sets

Table 3: Risk Stratification Benchmarking Against the Randomized Clinical Trial Eligibility Criteria

Table 4: Multivariable Cox Regression Analyses for the Independent Segmentectomy Test Set