[1] ZHU Lian-xiang, NIU Wen-yu, TONG Wen-dong, et al. Video Human Action Recognition Based on Hybrid Attention Mechanism[J]. Computer Technology and Development, 2023, 33(09): 105-112. [doi:10.3969/j.issn.1673-629X.2023.09.016]

Video Human Action Recognition Based on Hybrid Attention Mechanism

Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume:
33
Issue:
2023, No. 09
Pages:
105-112
Column:
Artificial Intelligence
Publication date:
2023-09-10

Article Info

Title:
Video Human Action Recognition Based on Hybrid Attention Mechanism
Article ID:
1673-629X(2023)09-0105-08
Author(s):
ZHU Lian-xiang, NIU Wen-yu, TONG Wen-dong, SHAO Hao-jie
School of Computer Science, Xi'an Shiyou University, Xi'an 710065, China
Keywords:
human action recognition; three-dimensional convolutional neural network; global context modeling; long-range dependence; attention mechanism
Classification number:
TP391.41
DOI:
10.3969/j.issn.1673-629X.2023.09.016
Abstract:
As a typical three-dimensional convolutional neural network, C3D has been widely used in video action recognition tasks. To address the problems of existing C3D-based action recognition methods, such as insufficient feature extraction, a tendency to overfit, and low recognition accuracy, a C3D network model that incorporates a hybrid attention mechanism is proposed. A hybrid attention module, built from a GCNet channel attention module and a 3D-Crisscross spatial attention module, is inserted into the original C3D network. Both attention networks perform global context modeling and can establish long-range dependencies over 3D features, which strengthens the network's ability to extract video features along the channel and spatial dimensions and improves the classification performance of the model.
The proposed method was tested on two large video datasets, UCF-101 and HMDB-51, and compared with other deep learning models. The experimental results show that the proposed method achieves higher recognition accuracy than the compared deep learning models, reaching 96.7% on UCF-101 and 63.3% on HMDB-51, a clear improvement over the original C3D method.
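The global context modeling mentioned in the abstract can be illustrated with a minimal NumPy sketch of a GCNet-style global context block applied to a 3D (time, height, width) feature map. This is not the authors' implementation: the function name `global_context_block_3d`, the weight names `w_k`, `w_v1`, `w_v2`, the bottleneck size, and the omission of learnable layer-norm parameters are all assumptions made for illustration. The sketch pools all T×H×W positions into one context vector with softmax attention, passes it through a bottleneck transform, and adds the result back to every position, which is how a single block can relate distant positions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def global_context_block_3d(x, w_k, w_v1, w_v2, eps=1e-5):
    """Hypothetical GCNet-style block for one sample.

    x:    (C, T, H, W) feature map
    w_k:  (C,)      1x1x1 conv producing one attention logit per position
    w_v1: (C_r, C)  bottleneck projection (assumed reduction C -> C_r)
    w_v2: (C, C_r)  projection back to C channels
    """
    c = x.shape[0]
    flat = x.reshape(c, -1)              # (C, N) with N = T*H*W
    attn = softmax(w_k @ flat)           # (N,) attention over all positions
    context = flat @ attn                # (C,) global context vector
    hidden = w_v1 @ context              # (C_r,)
    # Layer normalization without learnable scale/shift (simplification).
    hidden = (hidden - hidden.mean()) / np.sqrt(hidden.var() + eps)
    hidden = np.maximum(hidden, 0.0)     # ReLU
    delta = w_v2 @ hidden                # (C,) per-channel correction
    # Fusion: broadcast-add the same per-channel term to every position.
    return x + delta[:, None, None, None]

# Illustrative shapes only; random weights stand in for trained ones.
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 4, 6, 6))
w_k = rng.standard_normal(8) * 0.1
w_v1 = rng.standard_normal((2, 8)) * 0.1
w_v2 = rng.standard_normal((8, 2)) * 0.1
y = global_context_block_3d(x, w_k, w_v1, w_v2)
print(y.shape)
```

Because the fused term is identical at every position within a channel, the block reweights channels using information gathered from the whole clip; position-specific spatial attention, which the abstract attributes to the 3D-Crisscross module, is not covered by this sketch.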


Last Update: 2023-09-10