Food Recognition and Location Algorithm Based on Attention Mechanism

Computer Technology and Development (《计算机技术与发展》) [ISSN:1006-6977/CN:61-1281/TN]

Volume:: 32
Issue:: 2022, No. 11

Pages:: 121-126

Section:: Artificial Intelligence

Publication date:: 2022-11-10

Article Info

Title:: Food Recognition and Location Algorithm Based on Attention Mechanism

Article ID:: 1673-629X(2022)11-0121-06

Author(s):: PENG Geng (彭耿); LIU Ning-zhong (刘宁钟)

Affiliation:: School of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China

Keywords:: food recognition and location; deep learning; attention mechanism; feature fusion; YOLO

CLC number:: TP31

DOI:: 10.3969/j.issn.1673-629X.2022.11.018

Abstract:: With growing demand for applications such as food retrieval, food recommendation, and food monitoring, automatic food analysis has become an active research topic. Because food categories are numerous and exhibit small inter-class differences, large intra-class differences, and multiple scales, the accuracy of food recognition and localization has remained low. In addition, many existing approaches to food analysis suffer from slow inference and poor performance. To address these problems, an improved backbone network that incorporates an attention mechanism is proposed to better extract fine-grained food features. The Neck is also studied to perform multi-scale feature fusion, and a lightweight end-to-end food recognition and localization framework, FFAM (Feature Fusion of Attention Mechanism), is proposed. Experiments on the challenging public dataset UNIMIB2016 show that the proposed algorithm is more accurate than many current methods, reaching a final mAP of 94.1%. Because the resulting model is both more accurate and smaller than YOLOv4, it is better suited to deploying food analysis models on mobile and embedded devices for practical tasks.
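The abstract reports accuracy as mAP, the mean over classes of per-class average precision (AP). As a rough illustration of how such a figure is commonly computed for detection, the sketch below evaluates AP for a single class on a single image. It is not the paper's evaluation code: the function names `iou` and `average_precision`, the `(x1, y1, x2, y2)` box layout, the IoU threshold of 0.5, and the simplified greedy matching are all assumptions made for this example.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(detections, ground_truths, iou_threshold=0.5):
    """AP for one class.

    detections: list of (score, box); ground_truths: list of boxes.
    Assumption: each detection is greedily matched to the best
    still-unmatched ground-truth box (a simplification of the usual
    benchmark protocols).
    """
    if not ground_truths:
        return 0.0
    matched = [False] * len(ground_truths)
    tp_flags = []
    # Visit detections from most to least confident.
    for _score, box in sorted(detections, key=lambda d: d[0], reverse=True):
        best_iou, best_idx = 0.0, -1
        for idx, gt in enumerate(ground_truths):
            if matched[idx]:
                continue
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        is_tp = best_idx >= 0 and best_iou >= iou_threshold
        if is_tp:
            matched[best_idx] = True
        tp_flags.append(is_tp)

    # Precision and recall after each ranked detection.
    precisions, recalls, tp = [], [], 0
    for rank, is_tp in enumerate(tp_flags, start=1):
        tp += int(is_tp)
        precisions.append(tp / rank)
        recalls.append(tp / len(ground_truths))

    # Make precision monotonically non-increasing (precision envelope).
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Area under the precision-recall curve.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += p * (r - prev_recall)
        prev_recall = r
    return ap
```

Averaging the value returned by `average_precision` over all food classes would give an mAP under these assumptions; the exact protocol behind the 94.1% figure is not stated in the abstract.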
