View Thesis Information

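Title (Chinese):	基于影像组学的肺结节临床综合辅助诊断系统（MINs）的建立与应用
Author:	Li Yuan (李原)
Thesis language:	chi
Degree:	Doctoral
Degree type:	Professional degree
Institution:	Peking Union Medical College (北京协和医学院)
Department:	Cancer Hospital, Peking Union Medical College
Major:	Clinical Medicine - Oncology
Supervisor:	Gao Shugeng (高树庚)
Thesis completion date:	2024-04-07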
Title (foreign language):	Establishment and Application of a Radiomics-Based Clinical Auxiliary Diagnosis System (MINs) for Pulmonary Nodules
Keywords (Chinese):	肺结节; 影像组学; 预测模型; 风险分级; 辅助诊断
Keywords (foreign language):	Pulmonary nodules; Radiomics; Predictive model; Risk stratification; Auxiliary diagnosis
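Abstract (translated from Chinese):	︿ Part 1 Constructing a Radiomics-Based Risk Prediction Model for Benign and Malignant Pulmonary Nodules
Objective: Using radiomics and multicenter clinical data, to construct a risk prediction model that distinguishes benign lesions from lung adenocarcinoma among pulmonary nodules, and to explore its value in early screening for lung adenocarcinoma.
Methods: This study retrospectively collected and analyzed the clinical information, imaging data, and pathological data of 1313 patients with pulmonary nodules from 7 medical centers in China between October 2015 and September 2020. Of these, 1211 nodules from 4 centers (benign:adenocarcinoma = 570:641) formed the total training group, and 102 nodules from 3 centers (benign:adenocarcinoma = 48:54) formed the external validation group. After segmentation of the region of interest (ROI) of every nodule, 1316 radiomics features were extracted. The total training group was randomly divided 3:1 into an internal training set and an internal testing set, and three prediction models were built through significance testing and machine learning over 200 repeated random splits: (1) a clinical model based on the clinical-imaging semantic features of the total training group; (2) a radiomics model based on radiomics features; (3) a clinical-radiomics combined model based on both. Receiver operating characteristic (ROC) curves, the DeLong test, and decision curve analysis (DCA) were used to evaluate the models. The best model was selected to construct a nomogram, its goodness of fit was assessed with a calibration curve (Hosmer-Lemeshow goodness-of-fit test), and its performance was verified in the external validation group.
Results: In the total training group, 8 clinical-imaging semantic features (sex, age, nodule diameter, density, margin, cavity sign, lobulation sign, and pleural indentation sign) differed significantly and were used to build the clinical model. The optimal radiomics features were re-selected by machine learning in every split, and the combined model was built from both sets of features. After 200 repeated random splits, results in the internal training and testing sets showed that the combined and radiomics models generally had higher predictive performance (area under the curve, AUC) than the clinical model. DCA in the internal training set and the external validation group showed that the radiomics model had a relatively stable net benefit. When the best-performing radiomics model in the internal training set was selected (AUC: 0.857), it significantly outperformed the corresponding clinical model (AUC: 0.782, p=0.003). In the internal testing set, this radiomics model (AUC: 0.877) performed significantly better than the clinical model (AUC: 0.785, p<0.001) and did not differ significantly from the combined model (AUC: 0.886, p=0.468). In the external validation group, the radiomics model (AUC: 0.831) performed significantly better than both the clinical model (AUC: 0.624, p<0.001) and the combined model (AUC: 0.745, p=0.004). A nomogram was built for this radiomics model, and the calibration curve showed a good fit (p=0.269). This radiomics model was selected as the optimal model for distinguishing benign from malignant pulmonary nodules; it contains 20 radiomics features (accuracy: 0.773, sensitivity: 0.752, specificity: 0.792).
Conclusion: Compared with clinical-imaging semantic features, a prediction model built on radiomics features predicts the malignancy risk of pulmonary nodules better, and provides a clinically valuable reference for differentiating benign from malignant nodules and for early screening of lung adenocarcinoma.
Part 2 Constructing a Radiomics-Based Risk Prediction Model for the Invasiveness of Clinical Stage I Lung Adenocarcinoma
Objective: Using radiomics and multicenter clinical data, to construct a risk model that predicts the invasiveness of clinical stage I lung adenocarcinoma by distinguishing adenocarcinoma in situ (AIS) / minimally invasive adenocarcinoma (MIA) from invasive adenocarcinoma (IAC), and to explore its value in the precise diagnosis of early lung adenocarcinoma as a reference for treatment planning.
Methods: This study retrospectively collected and analyzed the clinical information, imaging data, and pathological data of 506 patients with clinical stage I lung adenocarcinoma from 7 medical centers in China between October 2015 and September 2020; lesion diameters were 5-20 mm. Of these, 377 lesions from 2 centers (AIS/MIA:IAC = 169:208) formed the total training group, and 129 lesions from 5 centers (AIS/MIA:IAC = 56:72) formed the external validation group. After ROI segmentation of every lesion, 1316 radiomics features were extracted. The total training group was randomly divided 3:1 into an internal training set and an internal testing set, and three prediction models were built through significance testing and machine learning over 200 repeated random splits: (1) a clinical model based on the clinical-imaging semantic features of the total training group; (2) a radiomics model based on radiomics features; (3) a clinical-radiomics combined model based on both. ROC curves, the DeLong test, and DCA were used to evaluate the models. The best model was selected to construct a nomogram, its goodness of fit was assessed with a calibration curve (Hosmer-Lemeshow goodness-of-fit test), and its predictive performance was verified in the external validation group.
Results: In the total training group, 8 clinical-imaging semantic features (age, lesion diameter, location, density, cavity sign, lobulation sign, spiculation sign, and pleural indentation sign) differed significantly and were used to build the clinical model. The optimal radiomics features were re-selected by machine learning in every split, and the combined model was built from both sets of features. After 200 repeated random splits, results in the internal training and testing sets showed that the combined and radiomics models generally had higher predictive performance than the clinical model. DCA in the internal training set and the external validation group showed that the combined model had a better and more stable net benefit. When the best-performing combined model in the internal training set was selected (AUC: 0.953), the corresponding radiomics model also reached its best performance (AUC: 0.932), and both were significantly better than the corresponding clinical model (AUC: 0.863, p<0.05). In the internal testing set, the combined model (AUC: 0.935) and the radiomics model (AUC: 0.933) both performed best, with no significant difference between them (p=0.921). However, only the combined model had a significantly higher AUC than the clinical model (AUC: 0.888, p=0.009); the radiomics model did not differ significantly from the clinical model (p=0.069). In the external validation group, the combined model (AUC: 0.824), the radiomics model (AUC: 0.816), and the clinical model (AUC: 0.799) did not differ significantly from one another. A nomogram was built for the combined model, and the calibration curve showed a good fit (p=0.359). This combined model was selected as the optimal model for assessing the invasiveness risk of clinical stage I lung adenocarcinoma; it contains 8 clinical-imaging semantic features and 12 radiomics features (accuracy: 0.872, sensitivity: 0.841, specificity: 0.897).
Conclusion: For clinical stage I lung adenocarcinoma, radiomics can distinguish AIS/MIA from IAC well and thereby assess the degree of invasion, but the role of clinical-imaging semantic features should not be overlooked. The combined prediction model built from both offers a new auxiliary diagnostic method for assessing the invasiveness risk of early lung cancer.
Part 3 Constructing a Radiomics-Based Risk Prediction Model for Lymph Node Metastasis in Clinical Stage I Invasive Lung Adenocarcinoma
Objective: Using radiomics and multicenter clinical data, to construct a risk model that predicts whether clinical stage I invasive lung adenocarcinoma has lymph node metastasis, and to explore its value in the comprehensive diagnostic assessment of early lung adenocarcinoma and in individualized treatment planning.
Methods: This study retrospectively collected and analyzed the clinical information, imaging data, and pathological data of 552 patients with clinical stage I lung adenocarcinoma from 5 medical centers in China between October 2015 and September 2020; lesion diameters on imaging were 10-30 mm. Of these, 452 lesions from 1 center (non-metastatic N0:metastatic N+ = 236:216) formed the total training group, and 100 lesions from 4 centers (N0:N+ = 57:43) formed the external validation group. After ROI segmentation of every lesion, 1316 radiomics features were extracted. The total training group was randomly divided 3:1 into an internal training set and an internal testing set, and three prediction models were built through significance testing and machine learning over 200 repeated random splits: (1) a clinical model based on the clinical-imaging semantic features of the total training group; (2) a radiomics model based on radiomics features; (3) a clinical-radiomics combined model based on both. ROC curves, the DeLong test, and DCA were used to evaluate the models. The best model was selected to construct a nomogram, its goodness of fit was assessed with a calibration curve (Hosmer-Lemeshow goodness-of-fit test), and its predictive performance was verified in the external validation group.
Results: In the total training group, 4 features (density, margin, lobulation sign, and spiculation sign) differed significantly and were used to build the clinical model. The optimal radiomics features were re-selected by machine learning in every split, and the combined model was built from both sets of features. After 200 repeated random splits, results in the internal training and testing sets showed that the combined and radiomics models generally had higher predictive performance than the clinical model. DCA in the internal training set and the external validation group showed that the radiomics model had a better and more stable net benefit. When the best-performing radiomics model in the internal training set was selected (AUC: 0.887), the corresponding combined model also reached its best performance (AUC: 0.892); the two did not differ significantly (p=0.978), but both were significantly better than the corresponding clinical model (AUC: 0.789, p<0.001). In the internal testing set, the radiomics model (AUC: 0.882) and the corresponding combined model (AUC: 0.876) both performed best, with no significant difference between them (p=0.402), but only the radiomics model was significantly better than the clinical model (AUC: 0.813, p<0.05). In the external validation group, the radiomics model (AUC: 0.787) and the combined model (AUC: 0.787) still scored higher than the clinical model (AUC: 0.694), but the differences were not significant. A nomogram was built for the radiomics model, and the calibration curve showed a good fit (p=0.545). This radiomics model was selected as the optimal model for assessing the risk of lymph node metastasis in clinical stage I invasive lung adenocarcinoma; it contains 12 radiomics features (accuracy: 0.799, sensitivity: 0.751, specificity: 0.852).
Conclusion: Compared with clinical-imaging semantic features, a prediction model built on radiomics features has good value for predicting lymph node metastasis in clinical stage I invasive lung adenocarcinoma, and provides a clinically meaningful reference for diagnosing and assessing the risk of lymph node metastasis in early lung cancer.
Part 4 Risk Stratification for Malignancy, Invasiveness, and Lymph Node Metastasis of Pulmonary Nodules and Establishment of the Clinical Comprehensive Auxiliary Diagnosis System (MINs)
Objective: Using the established prediction models for malignancy, invasiveness, and lymph node metastasis risk of pulmonary nodules, to set up a cross-regional, multicenter baseline patient database, perform risk stratification, and explore a new clinical comprehensive auxiliary diagnosis system for pulmonary nodules.
Methods: The established prediction models for malignancy, invasiveness, and lymph node metastasis risk were first applied to all 1701 pulmonary nodules (diameter 5-30 mm) retrospectively collected from 7 medical centers in China between October 2015 and September 2020. The hazard rate (HR) and odds ratio (OR) were calculated to build baseline reference databases for the three risks: 1701 cases for malignancy risk (benign:malignant = 607:1094), 1082 cases for invasiveness risk (AIS/MIA:IAC = 275:807), and 1071 cases for lymph node metastasis risk (N0:N+ = 812:259). Three risk levels (high, moderate, and low) were defined by combining the maximum Youden index of each baseline database's ROC curve with indicators such as the HR of the lesions and the positive-event rate in the database. This yielded a clinical comprehensive auxiliary diagnosis system that predicts malignancy, invasiveness, and lymph node metastasis risk in staged, streamlined steps: the MIN system (Malignancy-Invasiveness-Node metastasis system, MINs). The basic workflow is: (1) calculate the first-order (malignancy) risk HR of the nodule and obtain its risk level; (2) whenever a "moderate-to-high" risk is triggered, proceed to the next stage of risk evaluation, until the final stage outputs its preset result; (3) when a "low" risk is triggered, end the evaluation and output the preset result. Passing through the three stages in order gives 4 possible predicted diagnoses: benign, AIS/MIA, N0-IAC, and N+-IAC. A further 120 pulmonary nodules were retrospectively collected from 4 medical centers (3 centers in the database and 1 center outside it) between October 2021 and September 2022 as an external validation set. The accuracy, sensitivity, and specificity of the three risk stratification schemes were verified first, followed by the performance of the MIN system in predicting the clinical diagnosis of pulmonary nodules.
Results: (1) Pulmonary nodules, malignancy (adenocarcinoma) risk stratification: the AUC of the malignancy risk prediction model was 0.847 in the baseline database and 0.836 in the external validation set. For malignant nodules, sensitivity was 65.56%-93.33% and the accuracy of a high-risk diagnosis was 92.19%; for benign nodules, specificity was 53.33%-83.33% and the accuracy of a low-risk diagnosis was 72.73%. (2) Clinical stage I lung adenocarcinoma, IAC risk stratification: the AUC of the IAC risk prediction model was 0.880 in the baseline database and 0.892 in the external validation set. For IAC, sensitivity was 78.79%-93.94% and the accuracy of a high-risk diagnosis was 96.30%; for AIS/MIA, specificity was 54.17%-91.67% and the accuracy of a low-risk diagnosis was 76.47%. (3) Clinical stage I lung adenocarcinoma, N+ risk stratification: the AUC of the N+ risk prediction model was 0.859 in the baseline database and 0.874 in the external validation set. For N+, sensitivity was 84.21%-94.74% and the accuracy of a high-risk diagnosis was 51.61%; for N0, specificity was 73.24%-78.87% and the accuracy of a low-risk diagnosis was 98.11%. (4) Diagnostic performance of the MIN system: the overall accuracy of the first-order diagnosis (malignancy) was 83.33%, with 72.73% of nodules classified as "benign" and 85.71% of nodules classified as "malignant (adenocarcinoma)" being correct. The overall accuracy of the second-order diagnosis (invasiveness) was 70.83%, with 77.78% of nodules classified as "AIS/MIA" and 70% of nodules classified as "IAC" being correct. The overall accuracy of the third-order diagnosis (lymph node metastasis) was 64.17%, with 81.40% of nodules classified as "N0-IAC" and 35.13% of nodules classified as "N+-IAC" being correct.
Conclusion: Based on radiomics, this study established the MIN system, which predicts the three-stage risk of pulmonary nodules in a streamlined workflow. This new clinical comprehensive auxiliary diagnosis system provides a technical reference with considerable application value for predicting the clinical diagnosis of pulmonary nodules and planning comprehensive treatment. ﹀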
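Part 4 of the abstract derives its three risk levels partly from the maximum Youden index of each baseline ROC curve. The following is a minimal sketch of that one ingredient, not the thesis's actual procedure: `youden_cutoff` scans candidate cutoffs for the one that maximizes J = sensitivity + specificity - 1, and `risk_level` maps a score to three levels. The function names, the toy scores, and the two cutoffs `low_cut` and `high_cut` are all hypothetical; the abstract does not state how the boundaries between levels were fixed.

```python
def youden_cutoff(scores, labels):
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1.

    A case is called positive when its score is >= cutoff.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    best_cut, best_j = None, -1.0
    for cut in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cut)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cut)
        j = tp / pos + tn / neg - 1.0
        if j > best_j:  # keep the lowest cutoff among ties
            best_cut, best_j = cut, j
    return best_cut, best_j


def risk_level(score, low_cut, high_cut):
    """Map a risk score to one of three levels (cutoffs are placeholders)."""
    if score < low_cut:
        return "low"
    if score < high_cut:
        return "moderate"
    return "high"


# Toy scores and labels, invented for illustration only.
scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
cutoff, j = youden_cutoff(scores, labels)
```

On this toy data the scan settles on the lowest cutoff that reaches the maximum J; a real stratification would also weigh the other indicators the abstract mentions.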
Abstract (foreign language):	︿ Part 1 Constructing a Radiomics-Based Risk Prediction Model for Benign Pulmonary Nodules and Lung Adenocarcinoma Objective: Using radiomics and multicenter clinical data, this study constructed a risk prediction model to differentiate benign lesions from lung adenocarcinoma (LUAD) among pulmonary nodules and explored its value in early screening for LUAD. Methods: This study retrospectively collected and analyzed the clinical, imaging, and pathological data of 1313 patients with pulmonary nodules from 7 medical centers in China between October 2015 and September 2020. Of these, 1211 nodules (benign:LUAD = 570:641) from 4 medical centers formed the total training group, and 102 nodules (benign:LUAD = 48:54) from 3 medical centers formed the external validation group. After segmentation of the region of interest (ROI) of every nodule, 1316 radiomics features were extracted. The total training group was randomly divided into internal training and testing sets in a 3:1 ratio. Three predictive models were constructed through 200 repetitions of random grouping using significance testing and machine learning methods: (1) a clinical model based on clinical-imaging semantic features from the total training group; (2) a radiomics model based on radiomics features; and (3) a clinical-radiomics combined model based on the features of the previous two. The predictive performance of these models was evaluated using receiver operating characteristic (ROC) curves, the area under the curve (AUC), the DeLong test, and decision curve analysis (DCA). The best model was selected to construct a nomogram, and a calibration curve (Hosmer-Lemeshow goodness-of-fit test) was used to assess its goodness of fit. Finally, the performance of the selected model was validated in the external validation group. 
Results: In the total training group, 8 clinical-imaging semantic features (sex, age, maximum nodule diameter, density, margin, cavity sign, lobulation sign, and pleural indentation sign) differed significantly and formed the basis of the clinical model. The optimal radiomics features were reselected through machine learning in each random grouping, and a combined model was built using both sets of features. After 200 repetitions of random grouping in the total training group, results from the internal training and testing sets showed that the combined model and the radiomics model generally had higher predictive performance than the clinical model. DCA for the internal training set and the external validation group showed that the radiomics model had a more stable net benefit. The best-performing radiomics model in the internal training set (AUC: 0.857) significantly outperformed the corresponding clinical model (AUC: 0.782, p=0.003). In the internal testing set, this radiomics model's predictive performance (AUC: 0.877) was significantly higher than that of the clinical model (AUC: 0.785, p<0.001), with no significant difference from the combined model (AUC: 0.886, p=0.468). In the external validation group, the radiomics model's predictive performance (AUC: 0.831) was significantly higher than that of the clinical model (AUC: 0.624, p<0.001) and also significantly higher than that of the combined model (AUC: 0.745, p=0.004). A nomogram was constructed for this radiomics model, and its calibration curve showed a good fit (p=0.269). The radiomics model was selected as the optimal model for distinguishing between benign and malignant pulmonary nodules. It comprises 20 radiomics features (accuracy: 0.773, sensitivity: 0.752, specificity: 0.792). Conclusion: Compared with clinical-imaging semantic features, the predictive model based on radiomics features performs better in distinguishing benign pulmonary nodules from LUAD. 
This offers a clinically valuable reference for the diagnosis of pulmonary nodules and early screening of LUAD. Part 2 Constructing a Radiomics-Based Risk Prediction Model for the Invasiveness of Clinical Stage I Lung Adenocarcinoma Objective: Using radiomics and multicenter clinical data, this study constructed a risk model to predict the invasiveness of clinical stage I LUAD by differentiating adenocarcinoma in situ (AIS) / minimally invasive adenocarcinoma (MIA) from invasive adenocarcinoma (IAC). We explored its value in the precise diagnosis of early LUAD as a reference for formulating reasonable treatment plans. Methods: This study retrospectively collected and analyzed the clinical, imaging, and pathological data of 506 clinical stage I LUAD patients from 7 medical centers in China between October 2015 and September 2020. The lesions had a maximum diameter of 5-20 mm. Of these, 377 lesions from 2 medical centers (AIS/MIA:IAC = 169:208) formed the total training group, and 129 lesions from 5 medical centers (AIS/MIA:IAC = 56:72) formed the external validation group. After ROI segmentation of all lesions, 1316 radiomics features were extracted. The total training group was randomly divided into internal training and testing sets in a 3:1 ratio, and 3 predictive models were constructed through significance testing and machine learning methods in 200 repetitions of random grouping: (1) a clinical model based on clinical-imaging semantic features from the total training group; (2) a radiomics model based on radiomics features; and (3) a clinical-radiomics combined model based on the features of the previous two. The predictive performance of each model was evaluated using ROC curves, the DeLong test, and DCA. The best model was selected to construct a nomogram, and a calibration curve (Hosmer-Lemeshow goodness-of-fit test) was used to assess its goodness of fit. Finally, the predictive performance of the selected model was validated in the external validation group. 
Results: In the total training group, 8 clinical-imaging semantic features (age, lesion diameter, location, density, cavity sign, lobulation sign, spiculation sign, and pleural indentation sign) differed significantly and were used to construct the clinical model. The optimal radiomics features were reselected through machine learning in each random grouping, and a combined model was built using both sets of features. After 200 repetitions of random grouping in the total training group, results in the internal training and testing sets showed that the combined model and the radiomics model generally had higher predictive performance than the clinical model. DCA for the internal training set and the external validation group indicated that the combined model had a better and more stable net benefit. When the best-performing combined model in the internal training set was selected (AUC: 0.953), the corresponding radiomics model also achieved its best predictive performance (AUC: 0.932). Both were significantly better than the clinical model (AUC: 0.863, p<0.05). In the internal testing set, both the combined model (AUC: 0.935) and the radiomics model (AUC: 0.933) showed the best predictive performance, with no significant difference between them (p=0.921). However, only the combined model's AUC was significantly higher than that of the clinical model (AUC: 0.888, p=0.009), while the radiomics model showed no significant difference from the clinical model (p=0.069). In the external validation group, there were no significant differences among the combined model (AUC: 0.824), the radiomics model (AUC: 0.816), and the clinical model (AUC: 0.799). A nomogram was constructed for the combined model, and its calibration curve showed a good fit (p=0.359). The combined model was ultimately selected as the optimal model for assessing the invasiveness risk of clinical stage I LUAD. It comprises 8 clinical-imaging semantic features and 12 radiomics features (accuracy: 0.872, sensitivity: 0.841, specificity: 0.897). 
Conclusion: Radiomics can distinguish AIS/MIA from IAC well in clinical stage I LUAD and thereby assess lesion invasiveness, yet the role of clinical-imaging semantic features should not be overlooked. The combined predictive model built from both provides a new auxiliary diagnostic method for assessing the invasiveness risk of early lung cancer. Part 3 Constructing a Radiomics-Based Risk Prediction Model for Lymph Node Metastasis in Clinical Stage I Invasive Lung Adenocarcinoma Objective: Using radiomics and multicenter clinical data, this study constructed a risk model to predict lymph node metastasis in clinical stage I invasive LUAD. We explored its value in the comprehensive diagnostic assessment and individualized treatment of early LUAD. Methods: This study retrospectively analyzed the clinical, imaging, and pathological data of 552 patients with clinical stage I invasive LUAD from 5 medical centers in China between October 2015 and September 2020, with lesion diameters of 10-30 mm on imaging. Of these, 452 lesions from 1 center (N0:N+ = 236:216) formed the total training group, and 100 lesions from 4 centers (N0:N+ = 57:43) formed the external validation group. After ROI segmentation of all lesions, 1316 radiomics features were extracted. The total training group was randomly divided into internal training and testing sets in a 3:1 ratio, and 3 predictive models were constructed through significance testing and machine learning methods in 200 repetitions of random grouping: (1) a clinical model based on clinical-imaging semantic features from the total training group; (2) a radiomics model based on radiomics features; and (3) a clinical-radiomics combined model based on the features of the previous two. The predictive performance of these models was evaluated using ROC curves, the DeLong test, and DCA. The best model was selected to construct a nomogram, a calibration curve (Hosmer-Lemeshow goodness-of-fit test) was used to assess its goodness of fit, and its predictive performance was validated in the external validation group. 
Results: In the total training group, 4 features (density, margin, lobulation sign, and spiculation sign) differed significantly and were used to construct the clinical model. The optimal radiomics features were reselected through machine learning in each random grouping, and a combined model was built using both sets of features. After 200 repetitions of random grouping in the total training group, results in the internal training and testing sets showed that the combined model and the radiomics model generally had higher predictive performance than the clinical model. DCA for the internal training set and the external validation group showed that the radiomics model had a better and more stable net benefit. When the best-performing radiomics model in the internal training set was selected (AUC: 0.887), the corresponding combined model also achieved its best predictive performance (AUC: 0.892); the difference between the two was not significant (p=0.978), but both were significantly better than the clinical model (AUC: 0.789, p<0.001). In the internal testing set, the radiomics model (AUC: 0.882) and the corresponding combined model (AUC: 0.876) both showed the best predictive performance, with no significant difference between them (p=0.402), but only the radiomics model was significantly better than the clinical model (AUC: 0.813, p<0.05). In the external validation group, both the radiomics model (AUC: 0.787) and the combined model (AUC: 0.787) still scored higher than the clinical model (AUC: 0.694), but the differences were not significant. A nomogram was constructed for the radiomics model, and its calibration curve showed a good fit (p=0.545). The radiomics model was ultimately selected as the optimal model for assessing the risk of lymph node metastasis in clinical stage I invasive LUAD. 
This model included 12 radiomics features (accuracy: 0.799, sensitivity: 0.751, specificity: 0.852). Conclusion: Compared with clinical-imaging semantic features, radiomics-based predictive models show good value in predicting lymph node metastasis in clinical stage I invasive LUAD. This offers a clinically meaningful reference for assessing lymph node metastasis risk in early-stage lung cancer. Part 4 Establishing a Clinical Auxiliary Diagnostic System (MINs) for Pulmonary Nodules with Risk Stratification Objective: Using the established predictive models for pulmonary nodule malignancy, invasiveness, and lymph node metastasis risk, we set up a cross-regional, multicenter baseline patient database for risk stratification, aiming to establish a novel comprehensive clinical auxiliary diagnostic system for pulmonary nodules. Methods: Using the established predictive models for pulmonary nodule malignancy, invasiveness, and lymph node metastasis risk, we calculated the hazard rate (HR) and odds ratio (OR) for 1701 nodules (diameter 5-30 mm) collected retrospectively from 7 medical centers in China between October 2015 and September 2020, and built a baseline database for each risk type. The cases included 1701 for malignancy risk prediction (benign:malignant = 607:1094), 1082 for invasiveness risk prediction (AIS/MIA:IAC = 275:807), and 1071 for lymph node metastasis risk prediction (N0:N+ = 812:259). By combining the maximum Youden index of each baseline database's ROC curve with indicators such as lesion HR and the positive-event rate, we defined three risk levels: high, moderate, and low. Finally, we established a staged, streamlined comprehensive clinical auxiliary diagnostic system, MINs (Malignancy-Invasiveness-Node metastasis system), for predicting the malignancy, invasiveness, and lymph node metastasis risks of pulmonary nodules. 
The process involves: (1) calculating the first-order (malignancy) risk HR of a pulmonary nodule to determine its risk level; (2) whenever a "moderate-to-high" risk is triggered, proceeding to the next stage of risk evaluation until a final output is reached; (3) when a "low" risk is triggered, ending the evaluation and outputting the predetermined result. Through these three stages, we obtain four predicted diagnoses: Benign, AIS/MIA, N0-IAC, and N+-IAC. A further 120 pulmonary nodules collected retrospectively from 4 medical centers (3 centers in the database and 1 center outside it) between October 2021 and September 2022 were used as an external validation set. The accuracy, sensitivity, and specificity of the three risk stratification schemes were validated first, followed by the efficacy of the MINs in predicting the clinical diagnosis of pulmonary nodules. Results: (1) Risk Stratification for Pulmonary Nodules - Malignancy (LUAD): The AUCs of the malignancy risk prediction model in the baseline database and the external validation set were 0.847 and 0.836, respectively. In the malignancy risk stratification, the sensitivity for malignant nodules ranged from 65.56% to 93.33%, with an accuracy of 92.19% for high-risk diagnosis. For benign nodules, the specificity ranged from 53.33% to 83.33%, with an accuracy of 72.73% for low-risk diagnosis. (2) Risk Stratification for Clinical Stage I LUAD - IAC: The AUCs of the IAC risk prediction model in the baseline database and the external validation set were 0.880 and 0.892, respectively. In the IAC risk stratification, the sensitivity for IAC ranged from 78.79% to 93.94%, with an accuracy of 96.30% for high-risk diagnosis. For AIS/MIA, the specificity ranged from 54.17% to 91.67%, with an accuracy of 76.47% for low-risk diagnosis. 
(3) Risk Stratification for Clinical Stage I LUAD - N+: The AUCs of the N+ risk prediction model in the baseline database and the external validation set were 0.859 and 0.874, respectively. In the N+ risk stratification, the sensitivity for N+ ranged from 84.21% to 94.74%, with an accuracy of 51.61% for high-risk diagnosis. For N0, the specificity ranged from 73.24% to 78.87%, with an accuracy of 98.11% for low-risk diagnosis. (4) Diagnostic Performance of the MINs: The overall accuracy of the first-order diagnosis (malignancy) was 83.33%: the accuracy for nodules classified as "Benign" was 72.73%, and for those classified as "Malignant (LUAD)" it was 85.71%. The overall accuracy of the second-order diagnosis (invasiveness) was 70.83%: the accuracy for nodules classified as "AIS/MIA" was 77.78%, and for those classified as "IAC" it was 70%. The overall accuracy of the third-order diagnosis (lymph node metastasis) was 64.17%: the accuracy for nodules classified as "N0-IAC" was 81.40%, and for those classified as "N+-IAC" it was 35.13%. Conclusion: Based on radiomics, this study established the MIN system, which predicts the three-stage risk of pulmonary nodules in a streamlined workflow. This new comprehensive clinical auxiliary diagnostic system provides a valuable technical reference for predicting the clinical diagnosis of pulmonary nodules and planning their comprehensive treatment. ﹀
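The three-stage MIN workflow described in Part 4 can be sketched as a short cascade. This is an illustrative reading of the abstract, not the thesis's implementation: each stage reports its preset result as soon as the risk level is "low", and only otherwise hands the nodule to the next stage. The names `min_system` and `grader_for`, the dictionary keys, and the cutoffs 0.3, 0.4, and 0.5 are hypothetical placeholders; the real system grades risk from the HR computed by each radiomics model.

```python
# Each stage pairs a risk grader with the result reported when that stage is "low".
STAGES = (
    ("malignancy", "Benign"),
    ("invasiveness", "AIS/MIA"),
    ("node_metastasis", "N0-IAC"),
)


def min_system(nodule, graders):
    """Walk the three stages in order; stop at the first "low" risk level."""
    for stage, low_risk_result in STAGES:
        if graders[stage](nodule) == "low":
            return low_risk_result
    # Moderate-to-high risk at every stage.
    return "N+-IAC"


def grader_for(key, low_cut):
    """Hypothetical grader: reads a precomputed risk score from the nodule record."""
    def grade(nodule):
        return "low" if nodule[key] < low_cut else "moderate-to-high"
    return grade


# Placeholder cutoffs, not values from the thesis.
graders = {
    "malignancy": grader_for("malignancy", 0.3),
    "invasiveness": grader_for("invasiveness", 0.4),
    "node_metastasis": grader_for("node_metastasis", 0.5),
}

result = min_system({"malignancy": 0.8, "invasiveness": 0.2}, graders)
```

Because the loop returns early, a later-stage score is never read for a nodule that an earlier stage has already graded "low", which mirrors step (3) of the workflow.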
Open-access date:	2024-06-03
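Parts 1 to 3 of the abstract evaluate each model by AUC over 200 random 3:1 splits of the total training group. The sketch below shows only the shape of that loop, under several assumptions that the abstract does not confirm: the split is stratified by class, the data are synthetic, and `fit_sign` is a toy one-feature stand-in for the feature selection and model fitting that the thesis repeats in every split. All function names are hypothetical.

```python
import random


def auc(scores, labels):
    """AUC as the chance that a positive case outscores a negative one (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_split(labels, rng, train_fraction=0.75):
    """Split case indices 3:1 within each class (stratification is an assumption)."""
    train, test = [], []
    for cls in (0, 1):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(len(idx) * train_fraction)
        train += idx[:cut]
        test += idx[cut:]
    return train, test


def repeated_auc(features, labels, fit, n_repeats=200, seed=0):
    """Refit on every random split; return a list of (train_auc, test_auc) pairs."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_repeats):
        train, test = stratified_split(labels, rng)
        score = fit([features[i] for i in train], [labels[i] for i in train])
        results.append(tuple(
            auc([score(features[i]) for i in part], [labels[i] for i in part])
            for part in (train, test)
        ))
    return results


def fit_sign(xs, ys):
    """Toy one-feature 'model': orient the feature so higher means more likely positive."""
    mean_pos = sum(x for x, y in zip(xs, ys) if y == 1) / ys.count(1)
    mean_neg = sum(x for x, y in zip(xs, ys) if y == 0) / ys.count(0)
    sign = 1.0 if mean_pos >= mean_neg else -1.0
    return lambda x: sign * x


# Synthetic data only: 40 cases with one overlapping feature.
data_rng = random.Random(1)
labels = [0] * 20 + [1] * 20
features = [data_rng.gauss(1.0 if y else 0.0, 1.0) for y in labels]
runs = repeated_auc(features, labels, fit_sign)
```

Comparing the spread of training and testing AUCs across `runs` gives a rough view of how stable a model is across splits, which is the kind of comparison the abstract reports between the clinical, radiomics, and combined models.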