论文题名(中文): | 人工智能技术在荧光腹腔镜胆囊切除手术中去除肝脏荧光污染的探索 |
姓名: | |
论文语种: | chi |
学位: | 硕士 |
学位类型: | 学术学位 |
学校: | 北京协和医学院 |
院系: | |
专业: | |
指导教师姓名: | |
论文完成日期: | 2025-02-16 |
论文题名(外文): | Exploration of Artificial Intelligence Techniques for Removing Hepatic Fluorescence Contamination During Fluorescence-Guided Laparoscopic Cholecystectomy |
关键词(中文): | |
关键词(外文): | Artificial Intelligence; Deep Learning; Multimodal Segmentation; Laparoscopic Cholecystectomy |
论文文摘(中文): |
胆石症是常见的消化系统疾病,全球患病率居高不下。腹腔镜下胆囊切除术(LC)是胆石症手术治疗的"金标准",然而由于腹腔镜下视野受限,术者难以准确辨认胆管的解剖走行,手术并发症、特别是胆管损伤(BDI)的发生率反而较传统开腹手术有所上升。为降低 BDI 发生率,学界提出了达到关键安全视野(CVS)的方法,以及以吲哚菁绿(ICG)为介质的术中实时近红外荧光(NIRF)胆管造影技术,即 ICG 荧光腹腔镜胆囊切除术(ICGLC)。但在 ICGLC 场景下,术者常受大片肝脏荧光干扰,视觉压力增加,进而影响手术效果。近年来医学人工智能(AIM)技术飞速发展,在医学领域取得了许多成果,包括对外科手术术中影像的探究,然而大多为白光单模态研究,目前尚无结合 ICGLC 影像对荧光、白光多模态的研究尝试。本文首次基于深度学习技术进行多模态分析,构建了 Deeplabv3+mid fusion 模型方案,对 ICGLC 中的肝脏荧光信号进行识别并屏蔽,在一条崭新的道路上踏出了第一步。
本研究收集了 2022 年 6 月至 2023 年 3 月间北京协和医院单中心的 ICG 荧光腹腔镜下胆囊切除术(ICGLC)手术视频共 76 段,经筛选后留下 33 名患者的手术视频。随后将肝脏和胆管同时出现的片段进行剪辑,按 8:1 抽帧形成图像集,再把白光、荧光通道按 RGB 通道进行融合;融合后将图像集按完全对齐、不完全一致、存在错位进行筛选过滤,并据此将实验分为三个阶段;随后研究人员标注出所有图像中的肝脏轮廓用于模型训练。前两个阶段的训练集依照病例划分,第三阶段按 3:1 比例将存在错位的图像分别加入训练集和测试集。第一阶段中,将完全对齐的图像集按 RGB 通道融合前后分为白光组和多模态组,进行多模态与单白光的效果对比,再对多个热门模型进行对比,找到最佳模型。第二阶段中,加入不一致的图像进行数据增量,并对通道融合结构进行探究,找到最优模型和多模态融合方案。最后阶段引入存在错位的数据,验证方案的可行性。所有实验结果均采用计算机视觉评价指标(精准率、召回率、Dice 系数)进行评价,并对最终模型的输出进行荧光污染比值比较和临床医师主观评分。
第一阶段实验结果显示,对 ICGLC 中肝脏荧光的识别,多模态分割比单白光分割效果更好,而 Deeplabv3 是多模态优选算法模型。第二阶段实验结果显示,mid fusion 结构是多模态的优选方案;刚引入不一致数据后模型识别能力稍降,数据增量后模型性能提升。最后阶段的结果验证了前两个阶段得到的 Deeplabv3+mid fusion 方案是对 ICGLC 中肝脏荧光进行多模态分割的首选方案,以 0.863 的召回率与 0.912 的精准率识别并屏蔽了肝脏荧光。荧光污染比值的比较体现了模型处理前后肝脏荧光污染去除的客观效果优秀。3 名医师在观看加工后的视频后给出了平均超过 8.5 分(满分 10 分)的主观评分。
最后我们得出结论:本文提出的 Deeplabv3+mid fusion 方案能有效识别并屏蔽 ICGLC 中的肝脏荧光,减缓术者视觉疲劳,未来或能提高手术安全性。方法学上,本研究首次尝试了多模态深度学习分割方式,为将来更多的多模态研究奠定基础。 |
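The dataset-construction steps above (8:1 frame sampling, a patient-wise split for the first two experimental stages, and a random 3:1 split for the misaligned images in stage three) can be sketched as follows; all function names, field names, and the random seed are illustrative assumptions, not taken from the thesis:

```python
import random

def sample_frames(frame_indices, ratio=8):
    """Keep one frame out of every `ratio` frames (the thesis samples at 8:1)."""
    return frame_indices[::ratio]

def split_by_patient(samples, test_patients):
    """Patient-wise split used in stages one and two: every frame from a given
    patient goes to exactly one side, so no patient leaks across the split."""
    train = [s for s in samples if s["patient"] not in test_patients]
    test = [s for s in samples if s["patient"] in test_patients]
    return train, test

def split_random(samples, train_fraction=0.75, seed=0):
    """Random 3:1 train/test split used for the misaligned images in stage three."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]
```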
论文文摘(外文): |
Cholelithiasis is a common gastrointestinal disease with a high global prevalence. The current gold standard for surgical treatment is laparoscopic cholecystectomy (LC). However, due to the limited visual field in laparoscopic procedures, surgeons may encounter difficulty in accurately identifying the anatomical course of the bile ducts, resulting in a higher incidence of surgical complications—particularly bile duct injury (BDI)—compared to traditional open surgery. To reduce the risk of BDI, the Critical View of Safety (CVS) concept was proposed, along with indocyanine green fluorescence-guided laparoscopic cholecystectomy (ICGLC) using near-infrared fluorescence (NIRF) cholangiography as an intraoperative imaging tool. However, in ICGLC procedures, surgeons are often disturbed by excessive hepatic fluorescence, which obscures the surgical field, increases visual fatigue, and may affect surgical precision and outcomes. In recent years, artificial intelligence in medicine (AIM) has developed rapidly, achieving promising results across various domains, including intraoperative image analysis. However, current research has largely focused on single-modality white light imaging, and to date, no studies have explored the integration of ICGLC-specific multimodal imaging (white light + fluorescence) for intraoperative scene understanding. This study is the first to apply deep learning-based multimodal analysis to ICGLC video data, proposing a novel segmentation framework based on the Deeplabv3+ mid-fusion model to detect and suppress hepatic fluorescence signals—marking an initial and exploratory step into this innovative research direction. This study retrospectively collected 76 intraoperative ICGLC videos performed at Peking Union Medical College Hospital between June 2022 and March 2023. After screening, 33 patients' videos were selected.
Video segments simultaneously showing liver and biliary structures were manually clipped, and frames were extracted at an 8:1 sampling ratio, generating a dataset of surgical images. White light and fluorescence images were fused channel-wise into RGB format. The resulting image dataset was categorized into perfectly aligned, partially inconsistent, and spatially misaligned groups, forming the basis for a three-stage experimental framework. Liver contours in all images were manually annotated by trained surgical researchers for model training. In the first two stages, datasets were split by patient; in the third stage, the misaligned images were randomly divided into training and test sets at a 3:1 ratio. In the first stage, the perfectly aligned dataset was used to compare single-modality (white light only) segmentation against multimodal segmentation using channel-fused input. Several state-of-the-art deep learning models were compared, and Deeplabv3 was identified as the optimal architecture. In the second stage, we introduced inconsistent image pairs to expand the training dataset and evaluated various fusion strategies. Among them, mid-fusion demonstrated superior performance in multimodal feature integration. Although model performance slightly decreased with the introduction of inconsistent data, it improved significantly with increased data volume. In the final stage, spatially misaligned data were introduced to validate the robustness of the proposed method under real-world, non-ideal conditions. All models were evaluated using standard computer vision metrics, including precision, recall, and Dice coefficient, and fluorescence contamination before and after model processing was quantified using the Fluorescence Contamination Ratio (FCR). Additionally, clinical evaluation was performed based on subjective scoring from experienced surgeons.
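The pixel-wise evaluation metrics named above (precision, recall, and Dice coefficient) have standard definitions for binary segmentation masks; a minimal sketch, with illustrative function and variable names, is:

```python
import numpy as np

def mask_metrics(pred, gt):
    """Pixel-wise precision, recall, and Dice coefficient for boolean masks
    of the same shape. These are the standard computer-vision definitions
    corresponding to the metrics reported in the thesis."""
    tp = np.logical_and(pred, gt).sum()        # correctly predicted liver pixels
    fp = np.logical_and(pred, ~gt).sum()       # predicted liver, actually background
    fn = np.logical_and(~pred, gt).sum()       # missed liver pixels
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    dice = 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 1.0
    return precision, recall, dice
```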
In the first stage, multimodal segmentation outperformed white-light-only models for hepatic fluorescence recognition in ICGLC, with Deeplabv3 emerging as the most suitable algorithm. In the second stage, mid-fusion was confirmed as the optimal fusion strategy. Although performance initially declined when inconsistent data were introduced, it improved significantly after dataset augmentation. In the final stage, the proposed Deeplabv3+ mid-fusion model achieved a recall of 0.863 and a precision of 0.912 in identifying and suppressing hepatic fluorescence. The FCR analysis demonstrated a marked reduction in background fluorescence post-processing, confirming the objective effectiveness of the method. Subjective evaluation by three surgeons yielded average scores above 8.5 out of 10, indicating strong clinical acceptance. In conclusion, the proposed Deeplabv3+ mid-fusion model effectively identifies and suppresses hepatic fluorescence in ICGLC, alleviates visual fatigue for surgeons, and holds promise for improving surgical safety. Methodologically, this study represents the first application of multimodal deep learning segmentation in this domain and lays a foundational framework for future multimodal AI research in fluorescence-guided surgery. |
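The suppression step is described only as masking the fluorescence identified as hepatic; one minimal sketch, assuming suppression simply attenuates the fluorescence channel inside the predicted liver mask (the operator, names, and attenuation parameter are assumptions, not the thesis's actual implementation):

```python
import numpy as np

def suppress_fluorescence(fluorescence, liver_mask, attenuation=0.0):
    """Attenuate the fluorescence channel inside the predicted liver region.

    `fluorescence` is a 2-D intensity array, `liver_mask` a boolean array of
    the same shape produced by the segmentation model. attenuation=0 removes
    the hepatic fluorescence entirely; values in (0, 1) merely dim it.
    """
    out = fluorescence.astype(np.float32).copy()
    out[liver_mask] *= attenuation
    return out
```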
开放日期: | 2025-06-03 |