View thesis information


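Thesis title (Chinese):	基于自动化机器学习的临床预测模型构建研究：以脓毒症患者死亡风险预测为例
Name:	Chen Haoran (陈浩然)
Thesis language:	chi
Degree:	Master's
Degree type:	Academic degree
School:	Peking Union Medical College
Department:	Institute of Medical Information, Peking Union Medical College
Major:	Public Health and Preventive Medicine - Epidemiology and Health Statistics
Supervisor:	Li Jiao (李姣)
Thesis completion date:	2025-04-30
Thesis title (foreign language):	Development of Clinical Prediction Model Based on Automated Machine Learning: An Example of Mortality Risk Prediction in Patients with Sepsis
Keywords (Chinese):	clinical prediction model; machine learning modeling; automated machine learning; sepsis
Keywords (foreign language):	clinical prediction model; machine learning modeling; automated machine learning; sepsis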
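Abstract (Chinese, translated):	Clinical prediction models are key tools for improving disease diagnosis and treatment. Traditional linear regression models struggle with high-dimensional, nonlinear clinical data, which limits their predictive accuracy. Machine learning can capture complex patterns and improve accuracy, but its modeling workflow (feature selection, model tuning, and so on) requires specialist knowledge, which raises the barrier to adoption. Automated machine learning (AutoML) automates the full workflow of data preprocessing, feature engineering, model selection, and tuning, reducing the cost of manual intervention and improving efficiency. However, comprehensive comparisons between AutoML and traditional methods, and the development of clinical prototype systems, remain a gap; systematic solutions are lacking in particular for automated feature selection and model evaluation. Taking mortality risk prediction in patients with sepsis as an example, this study carries out a comprehensive comparison of traditional methods and AutoML to fill this gap.
(1) First, the study integrated clinical data on sepsis patients from three public databases, MIMIC-III CDC, MIMIC-IV, and eICU-CRD, and built a modeling database covering 50,139 patients who met the "Sepsis 3.0" diagnostic criteria. The database contains 89 clinical features, including basic patient information, laboratory test results, and clinical symptoms. In-hospital mortality among these sepsis patients was 22.2%.
(2) Second, the study used the MIMIC-IV sepsis data as the model development set and internal validation set, the MIMIC-III CDC sepsis data as the temporal validation set, and the eICU-CRD sepsis data as the geographic validation set, and built in-hospital mortality risk prediction models for sepsis patients with traditional methods and with automated machine learning. For the traditional approach, combinations of 14 modeling algorithms, 4 missing-value imputation methods, 1 one-hot encoding scheme, 4 standardization methods, 6 feature extraction techniques, and 5 sample balancing methods were considered, yielding 2,423 prediction models in total. In parallel, following a thorough survey and analysis, 8 widely used AutoML algorithms and 1 tabular foundation model were selected and used to build corresponding prediction models. The models built with traditional methods, with AutoML, and with the tabular foundation model were then compared across accuracy, calibration, clinical benefit, and deployability. For accuracy, AUC, ROC curves, and F1 score quantified predictive performance; for calibration, calibration curves assessed how well predicted probabilities matched observed event rates; for clinical benefit, decision curve analysis assessed clinical utility at different decision thresholds; for deployability, the suitability of common presentation formats such as nomograms and calculators was considered. Among the models built with traditional methods, XGBoost with mean imputation and Robust standardization performed well on all metrics. Models built with AutoML, especially those using AutoGluon, performed on par with XGBoost on every evaluation dimension while substantially shortening development time, which greatly improved model-building efficiency.
(3) Finally, based on the best-performing AutoML algorithm, the study developed a prototype system for building and evaluating clinical prediction models. The system automates the whole process of data preprocessing, feature selection, model training, optimization, and interpretation. To test its practical potential, the study ran case studies on three tasks, in-hospital outcome prediction, cross-sectional association analysis, and disease etiological subtyping, which further demonstrated the prototype's potential and value in clinical use.
In summary, using sepsis mortality risk prediction as an example, multidimensional validation across accuracy, calibration, clinical benefit, and deployment efficiency showed that AutoML-based models match traditional methods in predictive performance while markedly improving development efficiency. The AutoML prototype system further lowers the technical barrier, allowing clinicians to develop and optimize models independently. The study indicates that automated machine learning has broad application prospects in clinical medicine; its technical approach can serve as a reference for other clinical prediction tasks and help bring AI into real-world medical settings.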
Abstract (foreign language):	Clinical prediction models are key tools for improving disease diagnosis and treatment outcomes. Traditional linear regression models are limited in predictive accuracy because they handle high-dimensional, nonlinear clinical data poorly. Although machine learning can capture complex patterns and improve accuracy, its modeling process, including feature selection and model tuning, requires specialized expertise, creating a high barrier to widespread adoption. Automated machine learning (AutoML) reduces manual intervention costs and improves efficiency by automating the entire workflow, from data preprocessing and feature engineering to model selection and tuning. However, comprehensive comparative studies between traditional methods and AutoML, as well as the development of clinical prototype systems, remain unexplored, particularly in areas such as automated feature selection and model evaluation, where systematic solutions are lacking. This study addresses these gaps by conducting a comprehensive comparative analysis between traditional methods and AutoML, using mortality risk prediction in sepsis patients as a case example.
1. First, this study integrated clinical data from three publicly available databases: MIMIC-III CDC, MIMIC-IV, and eICU-CRD. The final modeling dataset included 50,139 sepsis patients meeting the "Sepsis 3.0" diagnostic criteria. The dataset comprised 89 clinical features, including patient demographics, laboratory results, and clinical symptoms, with an in-hospital mortality rate of 22.2% for sepsis patients.
2. Second, the study used MIMIC-IV sepsis data as the model development and internal validation set, MIMIC-III CDC sepsis data as the temporal external validation set, and eICU-CRD sepsis data as the geographical external validation set. Both traditional (manual) and AutoML-based models for sepsis in-hospital mortality risk prediction were constructed. In the manual modeling process, combinations of 14 modeling algorithms, 4 missing data imputation methods, 1 one-hot encoding technique, 4 normalization methods, 6 feature extraction techniques, and 5 sample balancing methods were considered, resulting in a total of 2,423 prediction models. Additionally, 8 widely used AutoML algorithms and 1 tabular foundation model were selected after a thorough survey and used to construct corresponding prediction models. The models were evaluated across multiple dimensions, including accuracy, calibration, clinical utility, and deployment feasibility. Accuracy was quantified using metrics such as AUC, ROC curves, and F1 scores; calibration was assessed through calibration curves, which evaluate the alignment between predicted probabilities and observed event rates; clinical utility was analyzed using decision curve analysis to assess the model's utility under different decision thresholds; and deployment feasibility was evaluated through the suitability of common display methods such as nomograms and calculators. The study found that, among the manually constructed models, XGBoost with mean imputation and robust scaling performed well across all evaluation metrics. The AutoML-based models, particularly those built using AutoGluon, achieved performance comparable to XGBoost on every evaluation dimension while significantly reducing development time, thus greatly improving model construction efficiency.
3. Finally, based on the optimal AutoML algorithm, this study developed a prototype system for clinical prediction model construction and evaluation. The system can automate the entire process, including data preprocessing, feature selection, model training, optimization, and interpretation. To validate the system's practical potential, case studies were conducted on three tasks: in-hospital outcome prediction, cross-sectional association analysis, and disease etiological subtyping. These case studies demonstrated the prototype system's clinical application potential and value.
In conclusion, this study takes sepsis mortality risk prediction as an example and demonstrates through multi-dimensional validation (accuracy, calibration, clinical utility, and deployment efficiency) that AutoML-based models achieve predictive performance comparable to traditional methods while significantly improving development efficiency. The construction of an AutoML system prototype further lowers technical barriers, enabling clinicians to independently develop and optimize models. The findings suggest that automated machine learning holds broad application prospects in clinical medicine, and its technical framework can serve as a reference for other clinical prediction tasks, promoting the real-world implementation of AI in healthcare.
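Illustrative note (not part of the thesis record): the abstract evaluates clinical utility with decision curve analysis. The sketch below shows the standard net-benefit quantity that such a curve plots at one decision threshold, alongside the usual "treat all" reference strategy. It is a minimal sketch under stated assumptions, not the thesis's own code: the function names `net_benefit` and `treat_all_net_benefit` and the toy labels and probabilities are hypothetical and chosen only for illustration.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating every patient whose predicted risk >= threshold.

    Standard decision-curve definition:
        NB = TP/n - (FP/n) * threshold / (1 - threshold)
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(y_true)
    if n == 0 or n != len(y_prob):
        raise ValueError("y_true and y_prob must be non-empty and equal in length")
    # Count true and false positives among patients flagged at this threshold.
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def treat_all_net_benefit(y_true, threshold):
    """Net benefit of the reference strategy that treats every patient."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical toy data: 1 = died in hospital, 0 = survived.
toy_labels = [1, 0, 1, 0, 0, 1, 0, 0, 0, 0]
toy_probs = [0.9, 0.1, 0.6, 0.4, 0.2, 0.3, 0.7, 0.05, 0.15, 0.1]

model_nb = net_benefit(toy_labels, toy_probs, threshold=0.5)
reference_nb = treat_all_net_benefit(toy_labels, threshold=0.5)
```

A decision curve repeats this calculation over a range of thresholds; a model is considered clinically useful at a threshold when its net benefit exceeds both the "treat all" value and zero (the "treat none" strategy).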
Release date:	2025-06-12
