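Thesis title (Chinese): | Prognosis prediction and risk stratification of breast cancer patients based on a mitochondria-related gene signature; Exploring plasma proteins and breast cancer susceptibility: a causal association study based on proteomics and transcriptomics |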
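Name: | |
Thesis language: | chi |
Degree: | Doctoral |
Degree type: | Professional degree |
Institution: | Peking Union Medical College |
Department: | |
Major: | |
Supervisor name: | |
On-campus supervisory group members (comma-separated): | |
Thesis completion date: | 2024-05-01 |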
Thesis title (foreign language): | Part Ⅰ: Prognosis prediction and risk stratification of breast cancer patients based on a mitochondria-related gene signature; Part Ⅱ: Elucidating the susceptibility to breast cancer: an in-depth proteomic and transcriptomic investigation into novel potential plasma protein biomarkers (completion date: April 2024) |
Keywords (Chinese): | |
Keywords (foreign language): | Breast Cancer; Mitochondria-related Gene; Risk Biomarker; Prognosis; Metabolic Reprogramming; Proteome-Wide Association Study; Transcriptome-Wide Association Study; Plasma Proteins; Mendelian Randomization |
Abstract (Chinese): |
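Part Ⅰ: Prognosis prediction and risk stratification of breast cancer patients based on a mitochondria-related gene signature. Abstract Background: Worldwide, breast cancer is the leading malignancy threatening women's health, and its high incidence poses a serious public health challenge. Researchers have progressively uncovered the central role of mitochondria in cellular metabolism, energy conversion, and the regulation of cell fate. In oncology in particular, the close link between mitochondrial dysfunction and the metabolic reprogramming of cancer cells has become a research focus. Mitochondria are not only the main source of cellular energy but also play a key role in regulating cell survival and in promoting or suppressing tumor progression. In-depth study of mitochondrial function-related genes in breast cancer can therefore offer a new perspective on pathogenesis as well as new strategies for prognostic assessment and risk stratification. Accordingly, this study aims to build a risk prediction model based on mitochondria-related genes and to explore its potential for improving prognosis prediction and risk stratification in breast cancer patients, with a view to providing more precise molecular targets for treatment and management. Methods: This study used breast cancer sample data from two authoritative databases, The Cancer Genome Atlas (TCGA) and the Molecular Taxonomy of Breast Cancer International Consortium (METABRIC), as the training set and validation set, respectively. Through in-depth analysis of the transcriptome data of these samples, eight mitochondrial function-related genes closely associated with breast cancer prognosis were selected, and Lasso-Cox regression was then applied to build a novel risk score model. The prognostic performance of the model was validated with receiver operating characteristic (ROC) curves. To improve predictive accuracy, we combined the risk score model with patients' clinical characteristics to develop a nomogram, whose predictive accuracy was verified by decision curve analysis and calibration curves. We then performed functional enrichment and immune infiltration analyses to explore possible reasons for the prognostic differences between risk groups, including comparisons of mutation status and drug sensitivity. Finally, we added molecular biology experiments to analyze the role of specific genes in breast cancer cell lines. Results: We built a risk score model comprising eight mitochondria-related genes: ACSL1, ALDH2, MTHFD2, MRPL13, TP53AIP1, SLC1A1, ME3, and BCL2A1. The model served as an independent risk predictor of survival in breast cancer patients (hazard ratio (HR) = 3.028, 95% confidence interval 2.038-4.499, P < 0.001). Further group analysis showed that, compared with the high-risk group, patients in the low-risk group not only had better survival but also showed more favorable trends in immune infiltration level, mutational landscape, and sensitivity to antitumor drugs. The high- and low-risk groups also differed significantly in tumor microenvironment and treatment responsiveness, providing an important basis for individualized treatment of breast cancer patients. Finally, molecular biology experiments further validated some of the genes in the model, adding validation at the level of basic experiments. Conclusion: In summary, the mitochondria-related gene risk model developed in this study shows good potential for assessing survival prognosis in breast cancer patients, offers some guidance for risk stratification and subsequent treatment strategy, and provides new leads for future research on and treatment of breast cancer.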
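Part Ⅱ: Exploring plasma proteins and breast cancer susceptibility: a causal association study based on proteomics and transcriptomics. Abstract Background: Breast cancer is one of the most common malignancies in women worldwide; its complex pathogenesis is still not fully understood, which limits the development of treatment strategies. With advances in bioinformatics, proteome-wide and transcriptome-wide association studies (PWAS/TWAS) have made it possible to explore the molecular mechanisms of breast cancer and discover new biomarkers. By combining PWAS/TWAS with Mendelian randomization, this study aims to identify plasma proteins that are associated with and causally related to breast cancer, providing new perspectives and targets for its prevention, diagnosis, and treatment. Methods: We adopted a high-throughput data analysis strategy and built a two-stage analytical framework to explore potential biomarkers and therapeutic targets for breast cancer at the molecular level. In the first stage, we applied proteome-wide and transcriptome-wide association studies (P/TWAS) to scan comprehensively for plasma proteins whose expression levels in plasma are significantly associated with breast cancer risk. This step draws on large-scale population genetic and protein expression data and uses statistical analysis to reveal which protein changes are related to the occurrence of breast cancer. In the second stage, we introduced Mendelian randomization to test for a causal relationship between a specific exposure (such as the expression level of a plasma protein) and a disease outcome (such as breast cancer). This allowed us to distinguish proteins that are merely associated with breast cancer from those that may directly contribute to its occurrence. To ensure the accuracy and robustness of the causal findings, we used external cohort validation and several sensitivity analyses: Bayesian colocalization analysis, to assess whether the effects of genetic variants on protein expression and on breast cancer risk share the same genetic locus; the Steiger test, to confirm that the direction of the genetic instruments is consistent with the hypothesis; and tests for heterogeneity and pleiotropy. These analyses strengthen the credibility and applicability of our results. Finally, we performed functional enrichment analysis on the plasma proteins significantly associated with breast cancer risk, to better understand their mechanisms in breast cancer development and to assess whether they have potential as future drug targets. Results: We identified five plasma proteins with strong association and causal links to breast cancer. PEX14 (OR = 1.201, P = 0.016) and CTSF (OR = 1.114, P < 0.001) both showed a positive association and causal relationship with breast cancer: higher expression levels were linked to increased breast cancer risk. In contrast, SNUPN (OR = 0.905, P < 0.001), CSK (OR = 0.962, P = 0.038), and PARK7 (OR = 0.954, P < 0.001) were associated with reduced breast cancer risk, suggesting a possible protective role in suppressing breast cancer development. In ER-positive breast cancer, the trends for CSK and CTSF were consistent with those in overall breast cancer, whereas GDI2 (OR = 0.920, P < 0.001) was specific to this subtype. In the ER-negative subtype, PEX14 (OR = 1.645, P < 0.001) was the only significantly associated plasma protein; it was not only strongly associated with breast cancer but also showed a more pronounced causal effect than in overall breast cancer, underscoring its importance in ER-negative breast cancer. These associations were all further supported by colocalization and sensitivity analyses. Conclusion: By jointly analyzing proteomic and transcriptomic data, our study identified plasma proteins with significant association and causal relationships with breast cancer and found potential biomarkers and therapeutic targets. These findings enrich breast cancer research, provide leads for developing new treatments and early diagnostic techniques, and show the value of studying disease at the molecular level.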
|
Abstract (foreign language): |
Part Ⅰ: Prognosis prediction and risk stratification of breast cancer patients based on a mitochondria-related gene signature. Abstract Background: Globally, breast cancer is the leading malignancy threatening women's health, and its high incidence poses a formidable public health challenge. With continuing advances in technology, researchers have progressively revealed the pivotal role of mitochondria in cellular metabolism, energy conversion, and the regulation of cell fate. In oncology in particular, the close connection between mitochondrial dysfunction and the metabolic reprogramming of cancer cells has become a focal point of research. Mitochondria are not only the primary source of cellular energy but also play a key role in regulating cell survival and death and in promoting or inhibiting tumor progression. An in-depth study of mitochondrial function-related genes in breast cancer can therefore provide new insights into the pathogenesis of the disease, as well as novel strategies for prognosis assessment and risk stratification. Against this backdrop, this study aims to construct a risk marker based on mitochondria-related genes and to explore its potential for improving prognosis prediction and risk stratification among breast cancer patients, thereby offering more precise molecular targets for the treatment and management of breast cancer. Methods: This study used breast cancer sample data from two authoritative databases, The Cancer Genome Atlas (TCGA) and the Molecular Taxonomy of Breast Cancer International Consortium (METABRIC), as the training set and independent validation set, respectively. Through in-depth analysis of the transcriptome data of these samples, eight mitochondrial function-related genes closely associated with breast cancer prognosis were identified, and a novel risk scoring model was then constructed using Lasso-Cox regression.
The prognostic performance of the model was validated with receiver operating characteristic (ROC) curves. To improve the accuracy of prognosis prediction, we combined the risk scoring model with patients' clinical features to develop a nomogram, whose predictive accuracy was validated with calibration curves and decision curve analysis. In addition, through functional enrichment and immune infiltration analyses, we examined the differences in prognosis between risk groups, including comparisons of mutation landscapes and drug sensitivity, to characterize the potential mechanisms and clinical value of the risk marker from multiple perspectives. Results: We constructed a risk scoring model comprising eight mitochondria-related genes: ACSL1, ALDH2, MTHFD2, MRPL13, TP53AIP1, SLC1A1, ME3, and BCL2A1. The model served as an independent risk predictor of survival for breast cancer patients (hazard ratio (HR) = 3.028, 95% confidence interval (CI) 2.038-4.499, P < 0.001). Further subgroup analysis showed that, compared with the high-risk group, patients in the low-risk group not only had better survival but also showed more favorable trends in immune infiltration levels, mutation landscapes, and sensitivity to anti-tumor drugs. These results indicate significant differences between the high- and low-risk groups in tumor microenvironment and treatment responsiveness, providing an important basis for personalized treatment of breast cancer patients.
Conclusion: In summary, the risk marker based on mitochondria-related genes developed in this study shows good potential for assessing survival prognosis in breast cancer patients and can help guide risk stratification and subsequent treatment strategies. Applying this risk marker may help improve treatment outcomes and quality of life for breast cancer patients, and it offers new clues for future research on and treatment of breast cancer.
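The risk scoring and stratification step described in Part Ⅰ can be sketched as follows. This is a minimal illustration, not the thesis's implementation: it assumes the score is the usual Lasso-Cox linear predictor (sum of coefficient times expression over the signature genes) and that patients are split at the median score. The abstract states neither the cutoff nor the fitted coefficients, so the coefficient values and their signs below are placeholders, and the function names are hypothetical.

```python
from statistics import median

# Gene names come from the abstract. The coefficients are PLACEHOLDERS
# (hypothetical values and signs), not the fitted Lasso-Cox coefficients.
PLACEHOLDER_COEFS = {
    "ACSL1": 0.10, "ALDH2": -0.10, "MTHFD2": 0.10, "MRPL13": 0.10,
    "TP53AIP1": -0.10, "SLC1A1": 0.10, "ME3": -0.10, "BCL2A1": -0.10,
}


def risk_score(expression, coefs=PLACEHOLDER_COEFS):
    """Linear predictor: sum of coefficient * expression over signature genes."""
    return sum(coefs[gene] * expression[gene] for gene in coefs)


def stratify(scores, cutoff=None):
    """Label each score 'high' or 'low' relative to a cutoff.

    The default cutoff is the median of the supplied scores, which is an
    assumption; the abstract does not say how the groups were defined.
    """
    if cutoff is None:
        cutoff = median(scores)
    return ["high" if score > cutoff else "low" for score in scores]
```

In practice the cutoff would be fixed on the training set (TCGA) and reused unchanged on the validation set (METABRIC), which is why `stratify` accepts an explicit `cutoff`.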
Part Ⅱ: Elucidating the susceptibility to breast cancer: an in-depth proteomic and transcriptomic investigation into novel potential plasma protein biomarkers. Abstract Background: Breast cancer is one of the most common malignancies among women globally; its complex etiology is not fully understood, which limits the development of therapeutic strategies. Advances in bioinformatics, particularly proteome-wide and transcriptome-wide association studies (PWAS/TWAS), make it possible to explore the molecular mechanisms of breast cancer and identify new biomarkers. Using PWAS/TWAS combined with Mendelian randomization, this study aims to identify plasma proteins that are associated with and causally related to breast cancer, providing new insights and targets for its prevention, diagnosis, and treatment. Methods: We used a high-throughput data analysis strategy with a two-phase framework to explore potential biomarkers and therapeutic targets for breast cancer at the molecular level. In the first phase, we used PWAS/TWAS to systematically scan for plasma proteins whose levels are significantly associated with breast cancer risk. This step draws on large-scale population genetic and protein expression data and uses statistical analysis to reveal protein variations associated with breast cancer incidence. In the second phase, we introduced Mendelian randomization, a statistical technique that uses genetic variants as instrumental variables to test the causal relationship between a specific exposure (such as a protein expression level) and a disease outcome (such as breast cancer).
This method allows us to distinguish proteins merely associated with breast cancer from those that might directly cause the disease. To ensure the accuracy and robustness of our causal findings, we further applied multiple sensitivity analyses, including Bayesian colocalization analysis, Steiger filtering, and tests for heterogeneity and pleiotropy, thereby strengthening the credibility and applicability of our results. Finally, we conducted functional enrichment analysis on the plasma proteins significantly and causally associated with breast cancer risk, to better understand their mechanisms in breast cancer development and to assess their potential as targets for future drug development. Results: Our study identified five plasma proteins strongly associated with and causally linked to breast cancer. Specifically, PEX14 (OR = 1.201, P = 0.016) and CTSF (OR = 1.114, P < 0.001) both showed positive associations and causal relationships with breast cancer, with higher expression levels linked to higher breast cancer risk. Conversely, SNUPN (OR = 0.905, P < 0.001), CSK (OR = 0.962, P = 0.038), and PARK7 (OR = 0.954, P < 0.001) were associated with a reduced risk of breast cancer, suggesting potential protective roles in inhibiting breast cancer development. Further analysis found that these proteins showed different association patterns across breast cancer subtypes. In estrogen receptor (ER)-positive breast cancer, the trends for CSK and CTSF were consistent with those in overall breast cancer, while GDI2 (OR = 0.920, P < 0.001) was specific to this subtype. For the ER-negative subtype, PEX14 (OR = 1.645, P < 0.001) was the only significant plasma protein; it was not only strongly associated with breast cancer but also showed a more pronounced causal effect than in overall breast cancer, emphasizing its importance in ER-negative breast cancer.
These associations were further validated through colocalization and sensitivity analyses. Conclusion: By integrating genetic, proteomic, and transcriptomic data, our study identified plasma proteins significantly associated with and causally related to breast cancer, offering new insights into its molecular mechanisms and identifying potential biomarkers and therapeutic targets. These findings not only enrich breast cancer research but also lay a foundation for developing new treatments and early diagnostic techniques, demonstrating the value of exploring disease at the molecular level.
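The Mendelian randomization step in Part Ⅱ reports odds ratios per protein. One common way such an estimate is obtained can be sketched as below. This is an illustration under stated assumptions, not the thesis's pipeline: the abstract does not name the estimator, so the Wald ratio (single instrument) and the fixed-effect inverse-variance-weighted (IVW) estimator are assumed here, the standard errors use the usual first-order approximation, and all function names are hypothetical.

```python
import math


def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Single-instrument MR estimate (log odds per unit of exposure).

    beta_exposure: variant effect on the protein level (e.g. from a pQTL).
    beta_outcome:  variant effect on breast cancer (log odds, from a GWAS).
    The standard error is the first-order approximation that ignores
    uncertainty in beta_exposure.
    """
    estimate = beta_outcome / beta_exposure
    se = se_outcome / abs(beta_exposure)
    return estimate, se


def ivw(betas_exposure, betas_outcome, ses_outcome):
    """Fixed-effect inverse-variance-weighted estimate across instruments."""
    weights = [bx ** 2 / so ** 2 for bx, so in zip(betas_exposure, ses_outcome)]
    ratios = [by / bx for bx, by in zip(betas_exposure, betas_outcome)]
    estimate = sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    return estimate, se


def odds_ratio_with_ci(estimate, se, z=1.96):
    """Convert a log-odds estimate to an odds ratio with a 95% interval."""
    return (math.exp(estimate),
            math.exp(estimate - z * se),
            math.exp(estimate + z * se))
```

An odds ratio above 1, as reported for PEX14 and CTSF, corresponds to a positive log-odds estimate; a value below 1, as for SNUPN, CSK, and PARK7, corresponds to a negative one.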
|
Open-access date: | 2024-06-09 |