[1] JIANG Shi-hao, QI Su-min, WANG Lai-hua, et al. Instance Segmentation Modal Based on Mask R-CNN and Multi-feature Fusion[J]. Computer Technology and Development, 2020, 30(09): 65-70. [doi:10.3969/j.issn.1673-629X.2020.09.012]

Instance Segmentation Based on Mask R-CNN and Multi-feature Fusion

Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume:
30
Issue:
No. 09, 2020
Pages:
65-70
Column:
Intelligence, Algorithms, and Systems Engineering
Publication date:
2020-09-10

Article Info

Title:
Instance Segmentation Modal Based on Mask R-CNN and Multi-feature Fusion
Article ID:
1673-629X(2020)09-0065-06
Author(s):
JIANG Shi-hao, QI Su-min, WANG Lai-hua, JIA Hui
School of Software Engineering, Qufu Normal University, Qufu 273165, China
Keywords:
instance segmentation; deep learning; Mask R-CNN; fully convolutional networks; feature fusion
CLC number:
TP391
DOI:
10.3969/j.issn.1673-629X.2020.09.012
Abstract:
To make fuller use of image feature information and improve instance segmentation, an instance segmentation model based on the Mask R-CNN architecture and multi-feature fusion is proposed. First, two branches are added to Mask R-CNN: an edge detection branch based on the holistically-nested edge detection (HED) model, which generates an edge feature map emphasizing edge information, and a semantic segmentation branch based on a fully convolutional network (FCN), which generates a semantic feature map emphasizing spatial location information. Second, during ROIAlign, each region of interest (ROI) is mapped to its corresponding pyramid level and to the adjacent levels, so that information from each level of the feature pyramid is used. Finally, the resulting feature maps are fused into a new, richer feature for the subsequent detection and segmentation tasks. Experiments show that the method improves both detection and segmentation accuracy: with ResNet50-FPN as the backbone network and no bells and whistles, box AP increases by 1.2% and mask AP by 1.0% compared with Mask R-CNN.
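The ROI mapping step in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes the standard FPN level-assignment rule (k = floor(k0 + log2(sqrt(w*h)/224)) with k0 = 4 and levels P2 to P5), assumes "adjacent layers" means the levels directly above and below, and uses a hypothetical element-wise sum as the fusion operator, since the abstract does not specify one. All function names are illustrative.

```python
import math
from typing import List

K_MIN, K_MAX = 2, 5      # assumed pyramid levels P2..P5
K0, CANONICAL = 4, 224   # constants of the standard FPN assignment rule


def target_level(w: float, h: float) -> int:
    """Standard FPN rule: k = floor(k0 + log2(sqrt(w*h) / 224)), clamped."""
    k = math.floor(K0 + math.log2(math.sqrt(w * h) / CANONICAL))
    return max(K_MIN, min(K_MAX, k))


def levels_with_neighbors(w: float, h: float) -> List[int]:
    """Assigned level plus the adjacent levels that exist in the pyramid."""
    k = target_level(w, h)
    return [lvl for lvl in (k - 1, k, k + 1) if K_MIN <= lvl <= K_MAX]


def fuse(pooled: List[List[float]]) -> List[float]:
    """Hypothetical fusion: element-wise sum of equally sized pooled features."""
    return [sum(values) for values in zip(*pooled)]
```

Under these assumptions, a 224x224 ROI is assigned to level 4 and pooled from levels 3, 4, and 5, while a very small ROI at the bottom of the pyramid has only one neighbor. The pooled features from each selected level would then be passed to `fuse` before the detection and mask heads.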

Last Update: 2020-09-10