[1] SHAO Chun-yang, LIU Ning-zhong. Knowledge Distillation Algorithm for Image Classification Based on Correct Predictions of Teacher Model[J]. Computer Technology and Development, 2025, (06): 94-99. [doi:10.20165/j.cnki.ISSN1673-629X.2025.0021]

Knowledge Distillation Algorithm for Image Classification Based on Correct Predictions of Teacher Model

Computer Technology and Development (《计算机技术与发展》) [ISSN:1006-6977/CN:61-1281/TN]

Volume:
Issue:
2025, No. 06
Pages:
94-99
Section:
Artificial Intelligence
Publication date:
2025-06-10

Article Info

Title:
Knowledge Distillation Algorithm for Image Classification Based on Correct Predictions of Teacher Model
Article ID:
1673-629X(2025)06-0094-06
Author(s):
SHAO Chun-yang (邵春阳)¹, LIU Ning-zhong (刘宁钟)¹,²
1. School of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, Jiangsu, China;
2. MIIT Key Laboratory of Pattern Analysis and Machine Intelligence, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, Jiangsu, China
Keywords:
deep learning; model compression; knowledge distillation; image classification; convolutional neural network
CLC number:
TP391.4
DOI:
10.20165/j.cnki.ISSN1673-629X.2025.0021
Abstract:
Knowledge distillation aims to reduce the complexity of large models. Logit distillation is a widely used strategy that takes the teacher model's output logits as soft labels to guide the training of the student model. Although this approach is effective in many cases, the teacher's prediction errors can harm knowledge transfer: the teacher may pass incorrect knowledge to the student, which biases the student as it imitates the teacher and degrades its final predictions. To address this problem, we introduce a simple yet effective knowledge distillation algorithm based on correct instance predictions (CIPKD). It filters out the instances that the teacher predicts incorrectly, so that the student model focuses on correctly predicted instances and the quality of the teacher's supervision improves. Moreover, CIPKD conveys not only the information contained in individual instances but also category-level information, which gives the student model more comprehensive semantic information. Experiments show that the proposed algorithm significantly improves the classification accuracy of the student model on multiple datasets, including CIFAR-100 and ImageNet, verifying its effectiveness.
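The instance-filtering idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes the distillation loss is the usual temperature-scaled KL divergence between teacher and student soft labels, and it simply drops instances whose teacher prediction disagrees with the ground-truth label. The names `softmax`, `masked_kd_loss`, and the temperature `T` are hypothetical, and the category-level component of CIPKD is omitted because the abstract does not specify it.

```python
import numpy as np


def softmax(logits, T=1.0):
    """Row-wise softmax of logits / T, computed in a numerically stable way."""
    z = logits / T
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def masked_kd_loss(student_logits, teacher_logits, labels, T=4.0):
    """KL(teacher || student), averaged over teacher-correct instances only.

    Assumption: a standard temperature-scaled KL loss; the exact loss used
    by CIPKD is not given in the abstract.
    Returns (loss, boolean mask of the instances that were kept).
    """
    # Keep only the instances that the teacher classifies correctly.
    correct = teacher_logits.argmax(axis=1) == labels
    if not correct.any():
        return 0.0, correct

    p_t = softmax(teacher_logits[correct], T)
    p_s = softmax(student_logits[correct], T)
    eps = 1e-12  # guards against log(0)
    kl = (p_t * (np.log(p_t + eps) - np.log(p_s + eps))).sum(axis=1)
    # The T**2 factor follows common knowledge-distillation practice.
    return float(kl.mean() * T * T), correct


if __name__ == "__main__":
    teacher = np.array([[5.0, 0.0, 0.0], [0.0, 5.0, 0.0], [0.0, 0.0, 5.0]])
    student = np.zeros_like(teacher)
    labels = np.array([0, 1, 0])  # the teacher is wrong on the third instance
    loss, kept = masked_kd_loss(student, teacher, labels)
    print(loss, kept)
```

In this sketch the third instance contributes nothing to the loss, because the teacher's top prediction (class 2) does not match its label (class 0). In practice such a term would typically be combined with a cross-entropy loss on the ground-truth labels, which still covers the filtered instances.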

Last Update: 2025-06-10