Lightweight YOLOv4 Part Detection Based on a Coordinated Attention Mechanism

Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume:: 34
Issue:: 2024, No. 08

Pages:: 23-29

Column:: Media Computing

Publication date:: 2024-08-10

Article Info

Title:: Part Detection Based on Coordinated Attention Mechanism Lightweight YOLOv4

Article ID:: 1673-629X(2024)08-0023-07

Author(s):: ZHU Wen-bo (朱文博); CHEN Long-fei (陈龙飞); YU Qi (余琦); School of Mechanical Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China

Keywords:: deep learning; coordinated attention mechanism; part detection; YOLOv4 network; MobileNeXt network

Classification number:: TP391.4

DOI:: 10.20165/j.cnki.ISSN1673-629X.2024.0139

Abstract:: Automatic part detection under complex working conditions, such as stacked or adhering parts and interference from debris, suffers from poor real-time performance and heavy hardware resource usage. To address these problems, a part detection method based on a lightweight YOLOv4 network is proposed. MobileNeXt replaces CSPDarkNet53 as the backbone feature extraction network, and a coordinated attention mechanism is added to each convolutional module to strengthen the semantic expressiveness of the feature maps. A Fused-Sandglass module is proposed and inserted into the shallow layers of the backbone to increase inference speed. For training, a progressive training method and the focal loss function are introduced to speed up training and effectively alleviate the imbalance between positive and negative samples. Experimental results show that, on a detection task covering 15 types of parts, the proposed method maintains accuracy close to that of the YOLOv4 network with only 20% of its parameters, and reaches an inference speed of 43.7 fps, which meets the needs of actual production.
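The abstract credits focal loss with easing the imbalance between positive and negative samples. The following is a minimal sketch of the standard binary focal loss, not the authors' implementation: the function names are hypothetical, and the values `alpha=0.25` and `gamma=2.0` are commonly used defaults assumed here for illustration, since the abstract does not state the settings used in the paper.

```python
import math


def cross_entropy(p, y, eps=1e-7):
    """Plain binary cross-entropy for one prediction (p = P(positive), y in {0, 1})."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -math.log(p if y == 1 else 1.0 - p)


def focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-7):
    """Binary focal loss: -alpha_t * (1 - p_t)**gamma * log(p_t).

    alpha and gamma are assumed defaults, not values taken from the paper.
    """
    p = min(max(p, eps), 1.0 - eps)
    p_t = p if y == 1 else 1.0 - p            # probability assigned to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# An easy background sample (confidently predicted as negative) versus
# a hard one (wrongly predicted as positive with high confidence).
easy_negative = focal_loss(0.05, 0)
hard_negative = focal_loss(0.90, 0)
```

The factor `(1 - p_t) ** gamma` shrinks the loss of well-classified samples, so the many easy background locations contribute far less to the total than the few hard ones. With `gamma=0` the expression reduces to class-weighted cross-entropy.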

Last Update: 2024-08-10
