[1] ZHU Yue, HUANG Hai-yu, LUO Xue-yi. 3D Human Pose and Shape Estimation Based on Live Video Streams[J]. Computer Technology and Development, 2024, 34(04): 42-47. [doi:10.3969/j.issn.1673-629X.2024.04.007]

3D Human Pose and Shape Estimation Based on Live Video Streams

Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume:
34
Issue:
2024, No. 04
Pages:
42-47
Section:
Media Computing
Publication date:
2024-04-10

Article Information

Title:
3D Human Pose and Shape Estimation Based on Live Video Streams
Article ID:
1673-629X(2024)04-0042-06
Author(s):
ZHU Yue (朱越)1, HUANG Hai-yu (黄海于)1, LUO Xue-yi (罗学义)2
1. School of Computer and Artificial Intelligence, Southwest Jiaotong University, Chengdu 611756, China;
2. Kunming Training Corps of Fire Rescue Bureau, Kunming 650217, China
Keywords:
3D human reconstruction; skinned multi-person linear (SMPL) model; real-time feature attention integration; graph convolutional neural network; machine learning
CLC number:
TP391.41
DOI:
10.3969/j.issn.1673-629X.2024.04.007
Abstract:
To meet the accuracy and realism requirements of 3D human pose and shape estimation from live video streams in applications such as the metaverse, gaming, and virtual reality, a method based on a temporal attention mechanism is proposed. First, image features are extracted and fed into a motion continuity attention module, which better calibrates the range of the time sequence that needs attention. Then, a real-time feature attention integration module effectively combines the feature representations of the current frame and past frames. Finally, a human parameter regression network produces the final result, and a generative adversarial network based on graph convolution judges whether the estimated model comes from real human motion data.
Compared with previous methods that operate on live video streams, the proposed method reduces the acceleration error by an average of 30% on mainstream datasets while cutting network parameters and computation by 65%. In practical tests it estimates 3D human pose and shape at 55 to 60 frames per second, providing a better user experience and higher application value for the metaverse, gaming, and virtual reality.
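Two ideas in the abstract can be made concrete with a small sketch. It is not the authors' implementation: `fuse_with_past` is a hypothetical stand-in for the feature integration step, assuming plain scaled dot-product attention with the current frame as the query, and `acceleration_error` assumes the usual finite-difference definition of joint acceleration error (in per-frame units). All function and parameter names are illustrative.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_with_past(current, past):
    """Attention-weighted blend of the current frame feature with past ones.

    Assumption: scaled dot-product attention, current frame as the query.
    Only the current and earlier frames are used, so no future frame is
    needed and the function can run on a live stream.
    """
    frames = past + [current]
    scale = math.sqrt(len(current))
    scores = [sum(q * k for q, k in zip(current, f)) / scale for f in frames]
    weights = softmax(scores)
    fused = [
        sum(w * f[i] for w, f in zip(weights, frames))
        for i in range(len(current))
    ]
    return fused, weights


def acceleration_error(pred, truth):
    """Mean joint acceleration error between two joint sequences.

    pred and truth are lists of frames; each frame is a list of (x, y, z)
    joints. Acceleration is the second finite difference over time.
    """
    def accel(seq, t, j):
        return [
            seq[t][j][a] - 2 * seq[t + 1][j][a] + seq[t + 2][j][a]
            for a in range(3)
        ]

    errors = []
    for t in range(len(pred) - 2):
        for j in range(len(pred[t])):
            errors.append(math.dist(accel(pred, t, j), accel(truth, t, j)))
    return sum(errors) / len(errors)
```

A lower acceleration error means smoother, less jittery motion across frames, which is why the abstract reports it alongside speed and model size.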
Last Update: 2024-04-10