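Thesis title (Chinese): | Construction of prognosis, recurrence and diagnosis models for liver cancer using copy number variation- and DNA methylation-driven genes |
Name: | |
Thesis language: | chi |
Degree: | Doctoral |
Degree type: | Professional degree |
University: | Peking Union Medical College |
Department: | |
Major: | |
Supervisor name: | |
Names of on-campus supervisory committee members (comma-separated): | |
Thesis completion date: | 2021-04-15 |
Thesis title (foreign language): | Construction of models for prognosis, recurrence and diagnosis of hepatocellular carcinoma using copy number variation- and DNA methylation-driven genes |
Keywords (Chinese): | |
Keywords (foreign language): | hepatocellular carcinoma; copy number variation; DNA methylation; driver genes |
Abstract (Chinese): |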
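Objective: The development of hepatocellular carcinoma (HCC) involves a series of genetic and epigenetic alterations. We jointly analyzed gene expression data, copy number variation (CNV) data, and DNA methylation data from liver cancer to establish prognostic, recurrence, and diagnostic models for liver cancer, with the aim of serving clinical practice. Methods: We performed an integrative analysis of gene expression, CNV, and DNA methylation data from liver cancer to identify CNV-driven genes and DNA methylation-driven genes. Combined with patients' clinical information, these genes were subjected to univariate Cox regression, LASSO (least absolute shrinkage and selection operator) regression, and multivariate Cox regression to build a prognostic model. We also used DNA methylation-driven genes to construct recurrence and diagnostic models for liver cancer. All models were externally validated. Results: We identified 568 CNV-driven genes and 123 DNA methylation-driven genes in liver cancer. Univariate Cox regression selected 63 CNV-driven genes associated with survival. LASSO and multivariate Cox analysis then yielded a prognostic risk model containing eight CNV-driven genes. Compared with the low-risk group, overall survival (OS) was significantly shorter in the high-risk group [hazard ratio (HR) = 6.14; P < 0.001]. Further analysis showed that the number of tumor-infiltrating neutrophils was positively correlated with the risk score. Patients in the high-risk group had higher expression of immune checkpoint genes. Joint analysis of DNA methylation and gene expression identified 123 DNA methylation-driven genes. Two of these genes (SPP1 and LCAT) were selected to build a prognostic model. In both the training set (HR = 2.81; P < 0.001) and the validation set (HR = 3.06; P < 0.001), the high-risk group had a markedly worse prognosis than the low-risk group. Multivariate Cox regression showed that the prognostic model was an independent predictor of HCC prognosis (P < 0.05). In addition, the recurrence model significantly distinguished HCC recurrence rates between the high-risk and low-risk groups in the training set (HR = 2.22; P < 0.001) and the validation set (HR = 2; P < 0.01). The two diagnostic models built from these two genes provided high accuracy in distinguishing liver cancer from normal samples and from dysplastic nodules. Conclusions: We identified CNV-driven and DNA methylation-driven genes in liver cancer, and constructed and validated prognostic, recurrence, and diagnostic models composed of these genes. The results obtained by integrating multidimensional genomic data suggest new research directions for liver cancer biomarkers and new possibilities for individualized treatment of patients with liver cancer. |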
Abstract (foreign language): |
Aim: The development of hepatocellular carcinoma (HCC) involves a series of genetic and epigenetic changes. We conducted a comprehensive analysis of gene expression data, copy number variation (CNV) data, and DNA methylation data from liver cancer, aiming to establish prognostic, recurrence, and diagnostic models for HCC that could be translated into clinical practice. Methods: We performed an integrative analysis of the gene expression, CNV, and DNA methylation data of liver cancer to identify CNV-driven genes and DNA methylation-driven genes. Combined with the clinical information of patients, these genes were subjected to univariate Cox regression analysis, LASSO (least absolute shrinkage and selection operator) regression analysis, and multivariate Cox regression analysis to establish a prognostic model. We also used DNA methylation-driven genes to construct a recurrence model and diagnostic models for liver cancer. All models were externally validated. Results: After integrative analysis of CNVs and corresponding mRNA expression profiles, 568 CNV-driven genes were identified. Sixty-three CNV-driven genes were found to be significantly associated with overall survival (OS) in univariate Cox regression analysis. After LASSO and multivariate Cox regression analysis, eight CNV-driven genes were selected to generate a prognostic risk model. Compared with the low-risk group, the OS of patients in the high-risk group was significantly shorter in both the TCGA [hazard ratio (HR) = 6.14, 95% confidence interval (CI): 2.72-13.86, P < 0.001] and ICGC (HR = 3.23, 95% CI: 1.17-8.92, P < 0.001) datasets. Further analysis revealed that the abundance of tumor-infiltrating neutrophils was positively correlated with the risk score. The high-risk group was also associated with higher expression of immune checkpoint genes.
After integrative analysis of DNA methylation and corresponding gene expression profiles, a total of 123 DNA methylation-driven genes were identified. Two of these genes (SPP1 and LCAT) were chosen to construct the prognostic model. The high-risk group showed a markedly unfavorable prognosis compared with the low-risk group in both the training (HR = 2.81; P < 0.001) and validation (HR = 3.06; P < 0.001) datasets. Multivariate Cox regression analysis indicated that the prognostic model was an independent predictor of prognosis (P < 0.05). The recurrence model also distinguished HCC recurrence rates between the high-risk and low-risk groups in both the training (HR = 2.22; P < 0.001) and validation (HR = 2; P < 0.01) datasets. The two diagnostic models provided high accuracy for distinguishing HCC from normal samples and from dysplastic nodules, respectively, in the training and validation datasets. Conclusions: We identified CNV-driven and DNA methylation-driven genes for HCC, and further constructed and validated prognostic, recurrence, and diagnostic models based on these driver genes. The results obtained by integrating multidimensional genomic data offer novel research directions for HCC biomarkers and new possibilities for individualized treatment of patients with HCC. |
Open-access date: | 2021-06-18 |
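
The abstract describes a gene-based risk model that separates patients into high-risk and low-risk groups. A minimal sketch of how such a score is commonly computed is given here, assuming the usual linear form (the sum of each gene's expression multiplied by its Cox coefficient) and a median cutoff. The record does not state the cutoff or the fitted coefficients, so the function names `risk_score` and `stratify`, the coefficient values, and the patient data below are all hypothetical and serve only as illustration.

```python
from statistics import median

# Hypothetical coefficients: the record does not report the fitted values,
# so both the signs and the magnitudes here are invented for illustration.
COEFFICIENTS = {"SPP1": 0.2, "LCAT": -0.3}

# Hypothetical expression values for four made-up patients.
PATIENTS = {
    "P1": {"SPP1": 8.0, "LCAT": 2.0},
    "P2": {"SPP1": 3.0, "LCAT": 6.0},
    "P3": {"SPP1": 5.0, "LCAT": 4.0},
    "P4": {"SPP1": 9.0, "LCAT": 1.0},
}


def risk_score(expression, coefficients):
    """Linear predictor: sum of coefficient * expression over the model genes."""
    return sum(coefficients[gene] * expression[gene] for gene in coefficients)


def stratify(scores):
    """Label each patient 'high' or 'low' relative to the median score.

    The median cutoff is an assumption; the record does not say which
    threshold was used.
    """
    cutoff = median(scores.values())
    return {pid: ("high" if score > cutoff else "low")
            for pid, score in scores.items()}


scores = {pid: risk_score(expr, COEFFICIENTS) for pid, expr in PATIENTS.items()}
groups = stratify(scores)

for pid in sorted(scores):
    print(pid, round(scores[pid], 2), groups[pid])
```

The resulting group labels would then be compared with a survival analysis (for example a Cox model or log-rank test) to obtain hazard ratios of the kind reported in the abstract; that step is outside this sketch.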