[1] ZHU Wen-bo, CHEN Long-fei, YU Qi. Part Detection Based on Coordinated Attention Mechanism Lightweight YOLOv4[J]. Computer Technology and Development, 2024, 34(08): 23-29. [doi:10.20165/j.cnki.ISSN1673-629X.2024.0139]

Part Detection Based on Coordinated Attention Mechanism Lightweight YOLOv4

Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume:
34
Issue:
2024, No. 08
Pages:
23-29
Column:
Media Computing
Publication date:
2024-08-10

Article Info

Title:
Part Detection Based on Coordinated Attention Mechanism Lightweight YOLOv4
Article ID:
1673-629X(2024)08-0023-07
Author(s):
ZHU Wen-bo, CHEN Long-fei, YU Qi
School of Mechanical Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
Keywords:
deep learning; coordinated attention mechanism; part detection; YOLOv4 network; MobileNeXt network
CLC number:
TP391.4
DOI:
10.20165/j.cnki.ISSN1673-629X.2024.0139
Abstract:
Automatic part detection under complex working conditions, such as stacked or adhering parts and interfering debris, suffers from poor real-time performance and high hardware resource usage. To address these problems, a part detection method based on a lightweight YOLOv4 network is proposed. MobileNeXt replaces CSPDarkNet53 as the backbone feature extraction network, and a coordinated attention mechanism is added to each convolutional module to enhance the semantic expressiveness of the feature maps. A Fused-Sandglass module is proposed and inserted into the shallow layers of the backbone to improve the network's inference speed. For training, a progressive training method and the focal loss function are introduced to speed up training and effectively alleviate the imbalance between positive and negative samples. Experimental results show that, on a detection task covering 15 types of parts, the proposed method maintains accuracy close to that of the YOLOv4 network with only 20% of its parameters, and reaches an inference speed of 43.7 fps, which meets the demands of actual production.
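
The abstract names focal loss as the remedy for the imbalance between positive and negative samples but does not give its form or settings. The following is a minimal sketch of the standard binary focal loss, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t), set against plain cross-entropy. The function names and the values alpha = 0.25 and gamma = 2.0 are assumptions for illustration (the defaults commonly used with focal loss), not the configuration reported in the paper.

```python
import math


def binary_cross_entropy(p, y, eps=1e-7):
    """Plain cross-entropy for one prediction.

    p: predicted probability of the positive class, y: label (0 or 1).
    """
    p = min(max(p, eps), 1.0 - eps)          # clamp to avoid log(0)
    p_t = p if y == 1 else 1.0 - p           # probability assigned to the true class
    return -math.log(p_t)


def binary_focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-7):
    """Focal loss for one prediction (alpha and gamma are assumed values)."""
    p = min(max(p, eps), 1.0 - eps)
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha   # class-balancing weight
    # (1 - p_t) ** gamma shrinks the loss of well-classified (easy) samples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    samples = {
        "easy negative (background)": (0.05, 0),
        "hard positive (occluded part)": (0.30, 1),
    }
    for name, (p, y) in samples.items():
        ce = binary_cross_entropy(p, y)
        fl = binary_focal_loss(p, y)
        print(f"{name}: cross-entropy={ce:.5f}, focal={fl:.5f}")
```

Because the modulating factor (1 - p_t)^gamma approaches zero for confidently classified samples, the many easy background samples contribute far less to the total loss than the few hard positives, which is the mechanism by which focal loss counteracts sample imbalance.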


Last Update: 2024-08-10