[1] GAO Wei-dong, LIN Lin, LIU Xian-mei, et al. Object 6D Pose Estimation Algorithm Based on Feature Fusion and Attention Mechanism[J]. Computer Technology and Development, 2023, 33(12): 92-100. [doi:10.3969/j.issn.1673-629X.2023.12.013]

Object 6D Pose Estimation Algorithm Based on Feature Fusion and Attention Mechanism

Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume:
33
Issue:
No. 12, 2023
Pages:
92-100
Column:
Media Computing
Publication date:
2023-12-10

Article Info

Title:
Object 6D Pose Estimation Algorithm Based on Feature Fusion and Attention Mechanism
Article ID:
1673-629X(2023)12-0092-09
Author(s):
GAO Wei-dong, LIN Lin, LIU Xian-mei, ZHAO Ya
School of Computer and Information Technology, Northeast Petroleum University, Daqing, Heilongjiang 163318, China
Keywords:
object 6D pose estimation; deep learning; feature fusion; attention mechanism; skip connection
CLC number:
TP391.4
DOI:
10.3969/j.issn.1673-629X.2023.12.013
Abstract:
Object 6D pose estimation is easily affected by the weak texture and small size of the target object, by complex backgrounds, and by occlusion. To address these problems, an object 6D pose estimation algorithm combining feature fusion and attention mechanisms is proposed. First, a Convolutional Block Attention Module is added to the first convolution block of the RGB image feature extraction network to improve the regional saliency of small, weakly textured objects. Second, skip connections based on the Convolutional Block Attention Module are introduced into the encoder-decoder RGB image feature extraction network. These connections fuse detailed appearance features such as color and texture from the encoding stage into the pose semantic features of the decoding stage, compensating for the lack of detailed appearance information in the pose semantic features. Third, a Channel Attention Module is used to improve the Pyramid Pooling Module, strengthening the connection between the visible and occluded regions of the target object and improving robustness to occlusion. Finally, the Convolutional Block Attention Module is used to reconstruct the pose semantic features output by the decoding stage, enhancing the discriminability of similar surface features and thereby reducing the interference of similar-looking objects on object 6D pose estimation. Experimental results show that the algorithm reaches 73.4% and 99.8% on the ADD(-S) metric on the Occlusion LINEMOD and LINEMOD datasets, respectively, which is 7.8 and 0.1 percentage points higher than FFB6D, verifying the feasibility of the algorithm.
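
The abstract reports results on the ADD(-S) metric without defining it. The following is a minimal sketch, assuming the definitions commonly used on LINEMOD-style benchmarks: ADD averages the distance between corresponding model points under the ground-truth and predicted poses, ADD-S (used for symmetric objects) averages the distance from each ground-truth point to its closest predicted point, and a pose is counted as correct when that distance is below 10% of the object diameter. The function names, the toy model, and the 0.1 ratio are illustrative assumptions and are not taken from the paper.

```python
import math

def transform(points, R, t):
    """Apply the rigid transform x -> R x + t to each 3D point."""
    return [
        tuple(sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))
        for p in points
    ]

def add_distance(points, R_gt, t_gt, R_pred, t_pred):
    """ADD: mean distance between corresponding transformed model points."""
    gt = transform(points, R_gt, t_gt)
    pred = transform(points, R_pred, t_pred)
    return sum(math.dist(a, b) for a, b in zip(gt, pred)) / len(points)

def add_s_distance(points, R_gt, t_gt, R_pred, t_pred):
    """ADD-S: mean distance from each ground-truth point to the closest predicted point."""
    gt = transform(points, R_gt, t_gt)
    pred = transform(points, R_pred, t_pred)
    return sum(min(math.dist(a, b) for b in pred) for a in gt) / len(points)

def pose_is_correct(distance, diameter, ratio=0.1):
    """Assumed convention: correct if the distance is below ratio * object diameter."""
    return distance < ratio * diameter

# Hypothetical toy data: a model that is symmetric under a 180-degree turn about z.
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
ROT_Z_180 = [[-1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, 1.0]]
ZERO = (0.0, 0.0, 0.0)
MODEL = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 0.0, 1.0)]

if __name__ == "__main__":
    # The flipped pose looks identical for this model: ADD penalizes it, ADD-S does not.
    print(add_distance(MODEL, IDENTITY, ZERO, ROT_Z_180, ZERO))
    print(add_s_distance(MODEL, IDENTITY, ZERO, ROT_Z_180, ZERO))
```

With this toy model, a prediction rotated by 180 degrees about the symmetry axis gets a large ADD value but an ADD-S value of zero, which is why the combined ADD(-S) score applies ADD-S to symmetric objects and ADD to the rest. The reported accuracy is then the fraction of test poses for which `pose_is_correct` holds.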


Last Update: 2023-12-10