
Age Estimation of Face Images by Fusing Similarity Cross Entropy and Knowledge Distillation

Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume:
Issue:: 2025, No. 04

Pages:: 113-120

Section:: Artificial Intelligence

Publication date:: 2025-04-10

Article Info

Title:: Age Estimation of Face Images by Fusing Similarity Cross Entropy and Knowledge Distillation

Article ID:: 1673-629X(2025)04-0113-08

Author(s):: YANG Wen-yin; HUANG Ai-quan; TAN Zhen-lin; LIU Zi-liang; ZHONG Yong*; School of Electronic Information Engineering, Foshan University, Foshan, Guangdong 528225, China

Keywords:: deep learning; age estimation; Vision Transformer; cross-entropy; knowledge distillation

Classification number:: TP18

DOI:: 10.20165/j.cnki.ISSN1673-629X.2024.0369

Abstract:: Face age estimation plays a critical role in fields such as security, human-computer interaction, and intelligent recommendation. However, existing face age estimation methods struggle to extract age features, which leads to large prediction errors. In addition, existing models are typically large, making efficient deployment on mobile devices difficult. To address these issues, we first propose SimViT-Age, an age estimation model that uses a Vision Transformer (ViT) as its backbone network to obtain high-quality age features from images. The model is optimized with an improved similarity cross-entropy loss function; compared with other state-of-the-art methods, it reduces the mean absolute error (MAE) by 0.05 on CACD and 0.14 on UTKFace. Second, we apply knowledge distillation to compress SimViT-Age, addressing the complex structure, large parameter count, and redundant computation of age estimation models. The results show that model size, parameter count, and computation are each reduced by more than 90% at a cost of no more than about 0.5 MAE. The approach thus improves model performance while making the model better suited to mobile applications.
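
The abstract names two standard ingredients, knowledge distillation and the MAE metric, without giving formulas. As a rough illustration only, the sketch below shows the generic temperature-based distillation loss (softened teacher/student distributions combined with a hard-label cross-entropy term) and a plain MAE computation. It does not reproduce the paper's improved similarity cross-entropy loss or its actual distillation setup; the function names, the `temperature` and `alpha` parameters, and the treatment of age as a class index are all assumptions made for this example.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax; a higher temperature gives a softer distribution."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Standard cross-entropy against one hard age-class label (assumed class index)."""
    return -math.log(softmax(logits)[label])


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Generic distillation loss (hypothetical weighting, not the paper's):
    alpha * T^2 * KL(teacher || student) + (1 - alpha) * hard-label cross-entropy.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(pt * math.log(pt / ps)
             for pt, ps in zip(p_teacher, p_student) if pt > 0)
    hard = cross_entropy(student_logits, label)
    return alpha * temperature ** 2 * kl + (1 - alpha) * hard


def mean_absolute_error(predicted_ages, true_ages):
    """MAE in years: average absolute gap between predicted and true ages."""
    gaps = [abs(p - t) for p, t in zip(predicted_ages, true_ages)]
    return sum(gaps) / len(gaps)


if __name__ == "__main__":
    # Toy logits over three age classes; the values are made up for illustration.
    teacher = [2.0, 1.0, 0.1]
    student = [1.5, 1.2, 0.3]
    print(distillation_loss(student, teacher, label=0))
    print(mean_absolute_error([30, 42, 25], [31, 40, 25]))
```

In this sketch, `alpha` trades off imitation of the teacher against fitting the ground-truth label, and the `temperature ** 2` factor is the usual rescaling that keeps the soft-target term comparable in magnitude as the temperature changes.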

