Thesis information

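Thesis title (Chinese):	心脏植入式电子设备感染风险预测模型的构建与验证
Author:	Zhang Xiaoxin (张晓欣)
Thesis language:	chi
Degree:	Master's
Degree type:	Professional degree
Institution:	Peking Union Medical College
Department:	School of Nursing, Peking Union Medical College
Major:	Nursing - Nursing
Supervisor:	Kang Xiaofeng (康晓凤)
Thesis completion date:	2025-05-14
Thesis title (English):	Development and validation of risk prediction model for cardiovascular implantable electronic device infections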
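Keywords (Chinese):	心脏植入式电子设备 (cardiovascular implantable electronic device); 感染 (infection); 风险预测模型 (risk prediction model); 机器学习 (machine learning)
Keywords (English):	Cardiovascular Implantable Electronic Device; Infection; Risk Prediction Model; Machine Learning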
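Abstract (translated from the Chinese):	Background: Implantation of a cardiovascular implantable electronic device (CIED) is an important treatment for refractory heart failure and malignant arrhythmia; it markedly improves patients' quality of life and raises their survival rate. With the widespread use of CIEDs, the accelerating aging of China's population, and the combined effects of device complexity and patient comorbidities, the incidence of CIED infection is rising noticeably. It has become one of the most common and most serious complications after implantation: it significantly increases length of stay, treatment costs, and risk of death, and it places a heavy financial and psychological burden on patients and their families. Analyzing the risk factors for CIED infection, identifying the core feature variables, and building a CIED infection risk prediction model is therefore clinically important for early screening of high-risk patients, optimizing clinical management strategies, and improving patient outcomes.
Objectives: 1. To investigate the current occurrence of CIED infection in China, select key feature variables with Lasso regression, and build CIED infection risk prediction models with three machine learning algorithms: logistic regression, support vector machine, and multilayer perceptron. 2. To evaluate and compare the predictive performance of the three models, and to use SHAP analysis to visualize how much each feature variable contributes to predicting CIED infection.
Methods: 1. Clinical data were collected retrospectively for patients who underwent CIED surgery in the arrhythmia ward of a Grade A tertiary hospital in Beijing between January 1, 2018 and May 31, 2021. The feature variables needed for modeling were selected and extracted through data preprocessing and Lasso regression. 2. The included samples were randomly split into a training set and a validation set at a ratio of 8:2. Logistic regression, support vector machine, and multilayer perceptron were used to build and validate the CIED infection risk prediction models. Predictive performance was assessed with the area under the ROC curve (AUC), accuracy, sensitivity, specificity, and F1 score, and SHAP analysis was performed. 3. Based on the SHAP results, summary plots and importance-ranking plots were used to visualize the contribution of the feature variables in the three models and to explore their value for clinical guidance.
Results: 1. A total of 2,530 patients were included, and the overall incidence of CIED infection was 3.44%. The dataset was randomly split 8:2, with 80% used as the training set and 20% as the validation set. The infection incidence was 3.36% in the training set and 4.15% in the validation set. 2. Lasso regression selected 12 feature variables for model building. The three models differed little between training and validation performance, and all achieved high accuracy. The logistic regression model was the most robust (training AUC 0.957, validation AUC 0.966); the multilayer perceptron model had the best overall performance (training AUC 0.976, validation AUC 0.970); the support vector machine model generalized well to the validation set (training AUC 0.946, validation AUC 0.987). 3. SHAP analysis was used to visualize the feature variables in the three models. History of pocket hematoma, hypertension, monocyte count, albumin level, long-term use of anticoagulants, poor incision healing, triple-chamber permanent pacemaker, and electrode or lead extraction contributed in all three models.
Conclusions: 1. The CIED infection risk prediction models built in this study with Lasso regression and three machine learning algorithms each have strengths and weaknesses. The logistic regression model offers stable predictive performance and good interpretability, with a reasonable distribution of SHAP values; the multilayer perceptron model has the best overall predictive performance, showing the advantage of nonlinear models in capturing complex relationships; the support vector machine model performs well on the validation set. 2. SHAP analysis shows that history of pocket hematoma, hypertension, monocyte count, albumin level, long-term use of anticoagulants, poor incision healing, triple-chamber permanent pacemaker, and electrode or lead extraction are important feature variables shared by all three models, which offers some clinical guidance for the early identification and screening of patients at high risk of CIED infection.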
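The evaluation metrics named in the abstract (accuracy, sensitivity, specificity, F1 score) all derive from one confusion matrix. The following is a minimal sketch of those formulas, not the thesis's own code. The names `ConfusionCounts` and `classification_metrics` are hypothetical, and the example counts are invented for illustration; they are only sized to match a 506-case validation set (20% of 2,530).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts for a binary classifier; 'positive' means CIED infection."""
    tp: int  # infected, predicted infected
    fp: int  # not infected, predicted infected
    tn: int  # not infected, predicted not infected
    fn: int  # infected, predicted not infected


def classification_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Return accuracy, sensitivity, specificity, and F1 for the counts."""
    total = c.tp + c.fp + c.tn + c.fn
    sensitivity = c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0
    specificity = c.tn / (c.tn + c.fp) if (c.tn + c.fp) else 0.0
    precision = c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": (c.tp + c.tn) / total,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical validation-set counts (NOT taken from the thesis).
example = ConfusionCounts(tp=18, fp=10, tn=475, fn=3)
metrics = classification_metrics(example)

# A classifier that never predicts infection, on the same hypothetical cases.
all_negative = classification_metrics(ConfusionCounts(tp=0, fp=0, tn=485, fn=21))

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
print(f"all-negative accuracy: {all_negative['accuracy']:.3f}")
```

With an outcome this rare, a classifier that never predicts infection still scores high accuracy while its sensitivity is zero, which is one reason to read accuracy alongside sensitivity, specificity, and F1.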
Abstract (English):	Background: Cardiovascular implantable electronic device (CIED) implantation is an important treatment for refractory heart failure and malignant arrhythmia; it can significantly improve patients' quality of life and survival. With the wide application of CIEDs, the aging of China's population, the growing complexity of implanted devices, and patient comorbidities, the incidence of CIED infection has increased markedly. CIED infection has become one of the most common and serious complications after implantation: it significantly increases length of hospital stay, treatment costs, and risk of death, and it imposes a severe economic and psychological burden on patients and their families. Analyzing the risk factors for CIED infection, identifying the core feature variables, and constructing a CIED infection risk prediction model are therefore of great clinical significance for early screening of high-risk patients, optimizing clinical management strategies, and improving patient prognosis.
Objective: 1. To explore the prevalence of CIED infection in China, identify key feature variables using Lasso regression, and develop CIED infection risk prediction models using three machine learning algorithms: logistic regression, support vector machine, and multilayer perceptron. 2. To evaluate and compare the predictive performance of the three machine learning models, and to visualize the importance of the feature variables for predicting CIED infection through SHAP analysis.
Methods: 1. Clinical data were collected retrospectively for patients who underwent CIED surgery in the arrhythmia ward of a tertiary hospital in Beijing from January 1, 2018, to May 31, 2021. The feature variables needed for model construction were screened and extracted via data preprocessing and Lasso regression. 2. The included samples were randomly divided into a training set and a validation set at a ratio of 8:2. The CIED infection risk prediction models were constructed and validated using three machine learning algorithms: logistic regression, support vector machine, and multilayer perceptron. Predictive performance was evaluated using the area under the ROC curve (AUC), accuracy, sensitivity, specificity, and F1 score, and SHAP analysis was performed. 3. Based on the SHAP results, the contribution of the feature variables in the three machine learning models was presented visually through summary plots and importance ranking plots, and their value for clinical guidance was explored.
Results: 1. A total of 2,530 patients were included in this study. The overall incidence of CIED infection was 3.44%. The dataset was randomly divided into training (80%) and validation (20%) sets. The infection incidence was 3.36% in the training set and 4.15% in the validation set. 2. Lasso regression identified 12 feature variables for constructing the machine-learning-based prediction models. The three models (logistic regression, multilayer perceptron, and support vector machine) showed only minor differences between training and validation performance, and all achieved high accuracy. The logistic regression model exhibited the most stable performance (AUC: 0.957 for training, 0.966 for validation). The multilayer perceptron model showed the best overall performance (AUC: 0.976 for training, 0.970 for validation). The support vector machine model demonstrated strong generalization on the validation set (AUC: 0.946 for training, 0.987 for validation). 3. SHAP analysis was conducted to visualize the feature variables in the three machine learning models. History of pocket hematoma, hypertension, monocyte count, albumin level, long-term use of anticoagulants, poor wound healing, triple-chamber permanent pacemaker, and lead or electrode removal contributed in all three models.
Conclusion: 1. The CIED infection risk prediction models constructed in this study using Lasso regression and three machine learning algorithms each have their own strengths and weaknesses. The logistic regression model has stable predictive performance and good interpretability, and its SHAP values are reasonably distributed; the multilayer perceptron model has the best overall predictive performance, demonstrating the advantage of nonlinear models in capturing complex relationships; and the support vector machine model performs well on the validation set. 2. SHAP analysis showed that history of pocket hematoma, hypertension, monocyte count, albumin level, long-term use of anticoagulants, poor wound healing, triple-chamber permanent pacemaker, and lead or electrode removal were important feature variables common to all three machine learning models, which offers clinical guidance for healthcare professionals in the early identification and screening of patients at high risk of CIED infection.
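The 8:2 random split and the AUC metric described in the abstract can be sketched in a few lines of standard-library Python. This is an illustrative sketch under stated assumptions, not the author's pipeline: the helper names `random_split`, `incidence`, and `roc_auc` are hypothetical, the labels are synthetic, and the count of 87 infections is an assumption chosen because it is roughly 3.44% of 2,530. The abstract does not say whether the split was stratified, so a plain shuffle is assumed here.

```python
import random


def random_split(labels, train_fraction=0.8, seed=0):
    """Shuffle case indices and cut them into train/validation index lists."""
    indices = list(range(len(labels)))
    random.Random(seed).shuffle(indices)
    cut = round(len(indices) * train_fraction)
    return indices[:cut], indices[cut:]


def incidence(labels, indices):
    """Fraction of positive (infected) cases among the given indices."""
    return sum(labels[i] for i in indices) / len(indices)


def roc_auc(y_true, scores):
    """AUC as the probability that a positive case outscores a negative one.

    Ties count as half a win. This pairwise form is quadratic in sample
    size, which is acceptable for a small illustration.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Synthetic labels: 2,530 cases with an ASSUMED 87 infections (about 3.44%).
labels = [1] * 87 + [0] * (2530 - 87)
train_idx, val_idx = random_split(labels)

print(f"train n={len(train_idx)}, incidence={incidence(labels, train_idx):.2%}")
print(f"validation n={len(val_idx)}, incidence={incidence(labels, val_idx):.2%}")

# Tiny worked AUC example: 3 of the 4 positive/negative pairs are ordered correctly.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

Because the shuffle is not stratified, the incidence in the two parts generally differs from the overall rate, in the same way the abstract reports different rates for its training and validation sets.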
Release date:	2025-06-05
