Liver Segmentation Method Based on Multi-scale Spatial Transformer

Computer Technology and Development [ISSN:1673-629X/CN:61-1450/TP]

Volume:
Issue:: No. 02, 2025

Pages:: 1-8

Column:: Media Computing

Publication Date:: 2025-02-10

Article Info

Title:: Liver Segmentation Method Based on Multi-scale Spatial Transformer

Article ID:: 1673-629X(2025)02-0001-08

Author(s):: DING Hou-lin 1,2,3; ZHANG Xiao-long 1,2,3; LIN Xiao-li 1,2,3; DENG He 1,2,3; REN Hong-wei 4
1. School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan 430065, China;
2. Institute of Big Data Science and Engineering, Wuhan University of Science and Technology, Wuhan 430065, China;
3. Hubei Key Laboratory of Intelligent Information Processing and Real-time Industrial System, Wuhan University of Science and Technology, Wuhan 430065, China;
4. Tianyou Hospital Affiliated to Wuhan University of Science and Technology, Wuhan 430064, China

Keywords:: 3D liver image segmentation; deep learning; cross self-attention mechanism; multi-scale spatial Transformer; multi-scale feature fusion

CLC Number:: TP391

DOI:: 10.20165/j.cnki.ISSN1673-629X.2024.0302

Abstract:: The liver varies widely in scale and closely resembles surrounding organs, which makes it difficult to segment the liver region accurately from abdominal medical images. Many existing methods combine CNNs and Transformers to capture both local and global feature dependencies in the image, and thereby achieve better performance. However, simple combinations overlook the importance of multi-scale feature fusion and attention mechanisms in image segmentation, and do not solve the liver segmentation problem well. This paper proposes a 3D liver image segmentation method based on a multi-scale spatial Transformer and a cross self-attention mechanism. The method first combines a CNN and a Transformer to progressively extract features at different scales, so that the network identifies the liver and its surrounding tissues more accurately. A multi-scale spatial Transformer then fuses features of different levels and scales along the spatial dimension, improving the network's ability to localize liver edges. Finally, a cross self-attention guided fusion module in the decoder reduces interference from irrelevant information such as noise and improves segmentation quality. Comparison and ablation experiments were conducted on the LiTS, CHAOS, and Sliver07 datasets and on an MRI dataset from a hospital. The results show that the proposed method achieves better segmentation performance than current mainstream networks and shows promise for clinical application.
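The abstract does not specify how the attention-guided fusion in the decoder is built. As a rough illustration of the general idea only, the sketch below applies generic scaled dot-product cross-attention in which decoder tokens act as queries over encoder tokens, followed by a residual sum. The function names (`softmax`, `cross_attention`, `guided_fusion`), the residual connection, and the omission of learned projection matrices are all assumptions made for this example; this is not the authors' module.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Generic scaled dot-product cross-attention (no learned projections).

    queries: decoder token vectors, each of length d
    keys, values: encoder token vectors (keys must also have length d)
    Returns one attended vector per query.
    """
    d = len(queries[0])
    attended = []
    for q in queries:
        # Similarity of this decoder token to every encoder token.
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys
        ]
        weights = softmax(scores)
        # Weighted average of encoder values; weakly related tokens get small weights.
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        attended.append(out)
    return attended


def guided_fusion(decoder_tokens, encoder_tokens):
    """Hypothetical fusion step: attend to encoder features, then add a residual."""
    attended = cross_attention(decoder_tokens, encoder_tokens, encoder_tokens)
    return [
        [dec + att for dec, att in zip(dec_vec, att_vec)]
        for dec_vec, att_vec in zip(decoder_tokens, attended)
    ]
```

In a 3D segmentation network the tokens would be flattened voxel or patch features, and the queries, keys, and values would normally pass through learned linear projections first; both are left out here to keep the sketch short.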

