[1] GAO Xiang, CHEN Zhi, YUE Wen-jing, et al. Human Semantic Recognition Model Based on Video Scene Deep Learning[J]. Computer Technology and Development, 2018, 28(06): 53-58. [doi:10.3969/j.issn.1673-629X.2018.06.012]

Human Semantic Recognition Model Based on Video Scene Deep Learning

Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume:
28
Issue:
2018, No. 06
Pages:
53-58
Section:
Intelligence, Algorithms, and Systems Engineering
Publication date:
2018-06-10

Article Info

Title:
Human Semantic Recognition Model Based on Video Scene Deep Learning
Article ID:
1673-629X(2018)06-0053-06
Author(s):
GAO Xiang (高翔) 1, CHEN Zhi (陈志) 1, YUE Wen-jing (岳文静) 2, GONG Kai (龚凯) 1
1. School of Computer, Nanjing University of Posts and Telecommunications, Nanjing 210023, Jiangsu, China;
2. School of Telecommunications and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210003, Jiangsu, China
Keywords:
video mining; deep learning; convolutional neural network; human semantics; support vector machine (SVM)
Classification number:
TP393
DOI:
10.3969/j.issn.1673-629X.2018.06.012
Document code:
A
Abstract:
To effectively analyze and integrate the video semantic cues related to human behavior, we propose a human semantic recognition model based on deep learning of video scenes. The model consists of mid-level semantic feature extraction, multi-channel semantic feature fusion, global fine-tuning, and semantic recognition. First, it extracts mid-level features from low-level images, using convolutional neural networks to obtain, in parallel, the channel semantics of human identity, human behavior, and context from the key-frame set of a video scene. It then fuses the mid-level features in a single semantic fusion layer, integrating these semantics through a multilayer semantic convolutional neural network and using a loss function to learn the latent relationships among the different semantic channels, which improves the robustness of human semantic fusion. Finally, it fine-tunes the parameters of the whole network with a large-margin loss function and uses an SVM classifier to perform video human semantic recognition. Experimental results show that the model achieves high accuracy on specific datasets and can efficiently recognize human semantics in video.
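The fusion and large-margin steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the network layout or the exact loss, so the sketch assumes simple concatenation as the fusion step, a linear scorer in place of the multilayer network, and a standard multiclass hinge loss as the large-margin objective. All function names, feature values, and weights below are hypothetical.

```python
from typing import List, Sequence


def fuse_channels(*channels: Sequence[float]) -> List[float]:
    """Concatenate per-channel feature vectors into one fused vector.

    Assumption: fusion by concatenation; the paper's fusion layer may differ.
    """
    fused: List[float] = []
    for channel in channels:
        fused.extend(channel)
    return fused


def class_scores(weights: Sequence[Sequence[float]],
                 features: Sequence[float]) -> List[float]:
    """Linear score per class: dot product of each weight row with the features."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]


def multiclass_hinge_loss(scores: Sequence[float], label: int,
                          margin: float = 1.0) -> float:
    """Sum the margin violations of every wrong class against the true class."""
    true_score = scores[label]
    return sum(
        max(0.0, margin + score - true_score)
        for k, score in enumerate(scores)
        if k != label
    )


# Hypothetical mid-level features for one key frame (values are made up).
identity = [1.0, 0.0]
behavior = [0.5, 0.5]
context = [0.0, 1.0]

fused = fuse_channels(identity, behavior, context)

# Hypothetical weights for two semantic classes over the six fused features.
weights = [
    [2.0, 0.0, 1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 0.0, 1.0],
]
scores = class_scores(weights, fused)

# The loss is zero when the true class wins by at least the margin.
loss_if_class_0 = multiclass_hinge_loss(scores, label=0)
loss_if_class_1 = multiclass_hinge_loss(scores, label=1)

# Predict the class with the highest score.
predicted = max(range(len(scores)), key=scores.__getitem__)
```

With these made-up numbers, class 0 outscores class 1 by exactly the margin, so the loss is zero if class 0 is the true label and positive otherwise. In a trained system the weights would be learned by minimizing this loss over many labeled key frames.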

Last Update: 2018-08-16