Thesis title (Chinese): | Development and Application of a Pan-Blood Cell Annotation Tool Based on a Ten-Million-Cell Atlas |
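Name: | |
Thesis language: | chi |
Degree: | Master's |
Degree type: | Academic degree |
Institution: | Peking Union Medical College |
Department: | |
Major: | |
Supervisor name: | |
On-campus advisory group member names (comma-separated): | |
Thesis completion date: | 2025-04-08 |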
Thesis title (foreign language): | Development and application of a PAN-Blood Data Annotator with a 10-Million Single-Cell Atlas |
Keywords (Chinese): | |
Keywords (foreign language): | Single-cell RNA sequencing; Immunology; Blood cells; Single-cell atlas; Cell type annotation |
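Abstract (Chinese): |
Objective: Rapid advances in single-cell RNA sequencing have transformed the study of cellular heterogeneity. However, because immune cells are highly diverse and functionally complex, accurate cell-type annotation remains a major challenge. This study aims to build a high-quality single-cell reference atlas of blood and to develop an automated annotation tool that improves the accuracy and generalizability of immune cell-type identification. Methods: We integrated single-cell transcriptome data from 16 public studies covering different physiological, aging, and disease states. The data underwent rigorous cleaning, quality control, preprocessing, and batch-effect-corrected integration. Two rounds of cell clustering divided the cell-type annotations of the reference atlas into two levels, and the second-level clusters were purified with CellHint to ensure annotation accuracy and cell-subtype purity. Results: We built a blood cell reference atlas at the scale of ten million single cells. The atlas is organized hierarchically and comprises 16 major categories, 54 minor categories, 611 high-level cell clusters (pmid_cts), and 4460 low-level cell clusters (pd_cc_cl_tfs). Based on this atlas, we developed an automated annotation tool for pan-blood single-cell data (PAN-blood single-cell Data Annotator, scPANDA) to achieve precise cell-type annotation. With this tool we achieved high-resolution automated annotation on multiple independent external immune single-cell datasets, identified cell clusters in which blood and tumor cells coexist in renal cancer data, and identified cell subpopulations conserved with humans in mouse (Mus musculus) and cynomolgus macaque (Macaca fascicularis), which validates its cross-species mapping and generalization capabilities. Conclusion: Relying on the large-scale, high-quality blood single-cell atlas built here, scPANDA achieves accurate and robust cell-type annotation and represents an efficient reference mapping approach. The tool improves the efficiency and reliability of blood single-cell data analysis and offers a new approach and a practical tool for precisely identifying immune cell subtypes in complex tissue environments and for cross-species comparison. scPANDA showed superior performance across multiple application scenarios and has broad prospects for research use. |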
Abstract (foreign language): |
Objective: The rapid advancement of single-cell RNA sequencing (scRNA-seq) technology has revolutionized the study of cellular heterogeneity, particularly within the hematological system. However, due to the vast diversity and functional complexity of immune cells, accurate cell-type annotation remains a major challenge. This study aims to construct a high-quality single-cell reference atlas of blood and to develop an automated annotation tool that improves the accuracy and generalizability of immune cell identification. Methods: We integrated scRNA-seq data from 16 publicly available studies encompassing various physiological, aging, and disease conditions. Rigorous data filtering, quality control, preprocessing, and batch effect correction were performed. Cell type annotations in the reference atlas were structured into two hierarchical levels through two rounds of clustering, with the second-level clusters purified using CellHint to ensure annotation accuracy and cell subtype purity. Results: We constructed a large-scale reference atlas consisting of over ten million blood cells, organized in a hierarchical structure comprising 16 compartments, 54 classes, 611 high-level clusters (pmid_cts), and 4,460 low-level clusters (pd_cc_cl_tfs). Based on this atlas, we developed PAN-blood single-cell Data Annotator (scPANDA), an automated annotation tool for accurate cell-type classification. scPANDA achieved high-resolution annotation in multiple independent immune scRNA-seq datasets, identified clusters in which blood and tumor cells coexist in renal cell carcinoma, and demonstrated robust cross-species mapping by identifying conserved immune subpopulations in Mus musculus and Macaca fascicularis. Conclusion: Leveraging a large-scale, high-quality blood single-cell atlas, scPANDA enables accurate and robust cell-type annotation, representing an efficient reference mapping strategy. The tool improves the reliability and efficiency of immune scRNA-seq data analysis and provides a novel and practical approach for precise immune subtype identification in complex tissue environments and for cross-species comparisons. scPANDA has demonstrated superior performance across diverse application scenarios and holds great promise for broad research utility. |
Open access date: | 2025-06-11 |