[1] LI Jian-ping, LAI Yong-qian. Video Behavior Recognition Based on Attention Mechanism and Residual Network[J]. Computer Technology and Development, 2023, 33(04): 69-74. [doi:10.3969/j.issn.1673-629X.2023.04.010]

Video Behavior Recognition Based on Attention Mechanism and Residual Network

Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume:
33
Issue:
2023, No. 04
Pages:
69-74
Section:
Media Computing
Publication date:
2023-04-10

Article Info

Title:
Video Behavior Recognition Based on Attention Mechanism and Residual Network
Article ID:
1673-629X(2023)04-0069-06
Author(s):
LI Jian-ping (李建平), LAI Yong-qian (赖永倩)
School of Computer and Information Technology, Northeast Petroleum University, Daqing 163318, Heilongjiang, China
Keywords:
deep learning; residual network; three-dimensional convolutional network; video behavior recognition; attention mechanism
Classification number:
TP391.4
DOI:
10.3969/j.issn.1673-629X.2023.04.010
Abstract:
Existing models for human behavior recognition in video have limited recognition ability, and two-stream recognition methods are easily affected by illumination, which leads to a high time cost. To address these problems, we propose a ResNeXt model based on an attention mechanism to recognize human behavior in video. Preprocessed video frames are used as the model input, and the convolutional network uses a 101-layer ResNeXt network (ResNeXt101) as its core residual block. On top of the three-dimensional ResNeXt convolutional neural network, an attention mechanism is introduced to strengthen important feature channels and to improve the feature representation and robustness of the network. Starting from a model pre-trained on Kinetics, we train on the UCF-101 and HMDB-51 datasets; after 200 iterations, the recognition rates on the validation set reach 96.0% and 69.9%, respectively. The experimental results show that the model effectively recognizes spatiotemporal features in video and improves accuracy over previous recognition models. The model retains a deep network while keeping features from being lost and preventing overfitting, and its recognition accuracy is also improved, which shows that it is effective and feasible.
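
The abstract says only that attention "strengthens important feature channels"; it does not specify the attention block. As a rough illustration, the sketch below assumes a squeeze-and-excitation style channel attention, which is one common way to reweight channels and may differ from the authors' design. It is framework-free Python on a toy feature map; the function names, the weight matrices `w1` and `w2`, and the reduction ratio are hypothetical.

```python
import math
import random


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def channel_attention(features, w1, w2):
    """SE-style channel attention (an assumed variant, not the paper's exact block).

    features: list of C channels, each a flat list of T*H*W activations.
    w1: (C // r) x C reduction weights; w2: C x (C // r) expansion weights.
    Returns the reweighted features and the per-channel weights.
    """
    # Squeeze: global average over time and space for each channel.
    squeezed = [sum(ch) / len(ch) for ch in features]
    # Excitation, step 1: bottleneck fully connected layer with ReLU.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    # Excitation, step 2: expand back to C channels; sigmoid keeps weights in (0, 1).
    weights = [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w2]
    # Scale: multiply every activation in a channel by that channel's weight.
    scaled = [[a * wt for a in ch] for ch, wt in zip(features, weights)]
    return scaled, weights


# Toy usage: C = 4 channels, T*H*W = 2*2*2 = 8 activations, reduction ratio r = 2.
random.seed(0)
C, N, R = 4, 8, 2
feats = [[random.random() for _ in range(N)] for _ in range(C)]
w1 = [[random.uniform(-1, 1) for _ in range(C)] for _ in range(C // R)]
w2 = [[random.uniform(-1, 1) for _ in range(C // R)] for _ in range(C)]
out, ch_weights = channel_attention(feats, w1, w2)
print("per-channel weights:", [round(w, 3) for w in ch_weights])
```

In a residual network, a block like this would typically be applied to the output of each residual branch before it is added back to the shortcut, so that channels with larger learned weights contribute more to the sum.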


Last Update: 2023-04-10