
Video Super-resolution Reconstruction Based on Self Attention Mechanism

Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume:: 32
Issue:: 2022, No. 08

Pages:: 42-48

Column:: Graphics and Images

Publication date:: 2022-08-10

Article Info

Title:: Video Super-resolution Reconstruction Based on Self Attention Mechanism

Article ID:: 1673-629X(2022)08-0042-07

Authors (original names):: 秦昊宇; 葛瑶; 张力波; 吴学致; 任卫军; School of Information Engineering, Chang'an University, Xi'an 710064, Shaanxi, China

Author(s):: QIN Hao-yu; GE Yao; ZHANG Li-bo; WU Xue-zhi; REN Wei-jun; School of Information Engineering, Chang’an University, Xi’an 710064, China

Keywords:: video super-resolution reconstruction; deep learning; residual neural network; video interpolation; multi-alignment fusion; self-attention mechanism

CLC number:: TP399

DOI:: 10.3969/j.issn.1673-629X.2022.08.007

Abstract:: Existing video super-resolution reconstruction methods improve video resolution effectively, but many of them do not fully account for the correlation between the temporal and spatial domains of inter-frame motion. To address this problem, a video super-resolution reconstruction model, VTSSR, is proposed that integrates the temporal and spatial domains and performs temporal and spatial super-resolution in a single network. The model uses a convolution layer and multiple residual blocks to extract features from low-frame-rate, low-resolution video, generates the feature maps of intermediate frames through feature interpolation, uses an improved self-attention module to fuse the temporal and spatial information of the feature maps simultaneously, and applies sub-pixel convolution upsampling to reconstruct high-frame-rate, high-resolution video. Tests on the Vid4 dataset show that VTSSR overcomes the difficulty that optical-flow prediction has with occlusion and complex motion, handles the fact that different adjacent frames contribute differently to key-frame reconstruction, and improves the quality of video super-resolution reconstruction.
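The three stages named in the abstract (feature interpolation, attention-based fusion, sub-pixel upsampling) can be illustrated with a minimal pure-Python sketch. This is not the VTSSR implementation: the function names `interpolate_features`, `attention_fuse`, and `pixel_shuffle` are hypothetical, the linear blend is a simple stand-in for the paper's learned feature interpolation, and the attention step is a plain scaled dot-product weighting rather than the improved module the authors describe.

```python
import math


def interpolate_features(prev_feat, next_feat, t=0.5):
    """Stand-in for feature interpolation: linear blend of two frame features.

    Assumption: the paper's interpolation is learned; this blend only
    illustrates where an intermediate-frame feature would come from.
    """
    return [(1.0 - t) * a + t * b for a, b in zip(prev_feat, next_feat)]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(key_feat, neighbor_feats):
    """Fuse neighbor-frame features, weighted by similarity to the key frame.

    Scaled dot-product scores give each adjacent frame its own weight, so
    frames that resemble the key frame contribute more to the fused result.
    """
    scale = math.sqrt(len(key_feat))
    scores = [
        sum(q * k for q, k in zip(key_feat, feat)) / scale
        for feat in neighbor_feats
    ]
    weights = softmax(scores)
    fused = [
        sum(w * feat[i] for w, feat in zip(weights, neighbor_feats))
        for i in range(len(key_feat))
    ]
    return fused, weights


def pixel_shuffle(channels, r):
    """Rearrange r*r feature maps of size H x W into one (H*r) x (W*r) map.

    Channel c is written to sub-pixel offset (c // r, c % r), which is the
    usual sub-pixel convolution layout.
    """
    assert len(channels) == r * r
    h, w = len(channels[0]), len(channels[0][0])
    out = [[0.0] * (w * r) for _ in range(h * r)]
    for c, fmap in enumerate(channels):
        dy, dx = divmod(c, r)
        for y in range(h):
            for x in range(w):
                out[y * r + dy][x * r + dx] = fmap[y][x]
    return out


# Usage: four 1x1 channels become one 2x2 high-resolution patch.
patch = pixel_shuffle([[[1]], [[2]], [[3]], [[4]]], 2)
print(patch)  # [[1, 2], [3, 4]]
```

The attention weights sum to 1, which makes the fused feature a convex combination of the adjacent-frame features; that is one simple way to model unequal contributions from different neighbors without relying on optical flow.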
