[1] WU Hua-tao, ZHU Zi-qi. Class Attention Knowledge Distillation Based on Channel Correlation[J]. Computer Technology and Development, 2024, 34(12): 125-131. [doi:10.20165/j.cnki.ISSN1673-629X.2024.0242]

Class Attention Knowledge Distillation Based on Channel Correlation

Computer Technology and Development (《计算机技术与发展》) [ISSN:1006-6977/CN:61-1281/TN]

Volume:
34
Issue:
No. 12, 2024
Pages:
125-131
Section:
Artificial Intelligence
Publication Date:
2024-12-10

Article Information

Title:
Class Attention Knowledge Distillation Based on Channel Correlation
Article ID:
1673-629X(2024)12-0125-07
Author(s):
WU Hua-tao (吴华涛), ZHU Zi-qi (朱子奇)
School of Computer Science & Technology, Wuhan University of Science and Technology, Wuhan 430065, China
Keywords:
knowledge distillation; channel correlation; class activation maps; channel knowledge; attention distillation
CLC Number:
TP391.4
DOI:
10.20165/j.cnki.ISSN1673-629X.2024.0242
Abstract:
Previous knowledge distillation methods have shown impressive performance in model compression. Among them, Class Attention Transfer based Knowledge Distillation (CAT-KD) demonstrated that transferring class activation maps enables a student model to acquire and strengthen the ability to identify the class-discriminative regions of the input, an ability that is key to classification in current mainstream CNN models. CAT-KD transfers class activation maps through average pooling and L2 normalization, which strengthens the student model's ability to identify class-discriminative regions and improves distillation performance. However, this approach ignores the channel-related knowledge in the class activation maps, which is crucial for the student model to learn that ability. To address this issue, a class attention transfer method based on channel correlation is proposed. Specifically, to extract rich knowledge from class activation maps, the method considers not only the per-channel feature knowledge within each sample's class activation maps, but also the per-channel relational knowledge across the class activation maps of different samples. Experiments show that the proposed method improves on the baseline by 0.96 percentage points on the CIFAR-100 dataset, outperforming the compared methods.
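The abstract describes three ingredients: CAT-KD's transfer of average-pooled, L2-normalized class activation maps, plus the proposed intra-sample channel knowledge and inter-sample per-channel relational knowledge. The paper's exact formulation is not reproduced on this page, so the following is only a minimal NumPy sketch under stated assumptions (function names, the 2x2 pooling grid, and the Gram-matrix form of the correlation terms are all illustrative, not the authors' definitions):

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-8):
    # Scale vectors to unit L2 norm along the given axis.
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def cat_kd_loss(t_cam, s_cam, pool=2):
    # CAT-KD-style transfer (sketch): average-pool each class activation
    # map to a (pool x pool) grid, L2-normalize per channel, then
    # penalize the squared teacher-student difference.
    # t_cam, s_cam: (B, C, H, W) class activation maps.
    B, C, H, W = t_cam.shape
    def pool_norm(cam):
        blocks = cam.reshape(B, C, pool, H // pool, pool, W // pool)
        pooled = blocks.mean(axis=(3, 5))            # (B, C, pool, pool)
        return l2_normalize(pooled.reshape(B, C, -1), axis=-1)
    return np.mean((pool_norm(t_cam) - pool_norm(s_cam)) ** 2)

def intra_sample_channel_loss(t_cam, s_cam):
    # Intra-sample channel knowledge (sketch): match the channel-to-channel
    # similarity (Gram) matrix of each sample's CAMs between teacher
    # and student, so channel correlations are transferred.
    B, C = t_cam.shape[:2]
    def gram(cam):
        f = l2_normalize(cam.reshape(B, C, -1), axis=-1)
        return np.einsum('bif,bjf->bij', f, f)       # (B, C, C)
    return np.mean((gram(t_cam) - gram(s_cam)) ** 2)

def inter_sample_channel_loss(t_cam, s_cam):
    # Inter-sample relational knowledge (sketch): for each channel,
    # match the sample-to-sample similarity structure across the batch.
    B, C = t_cam.shape[:2]
    def rel(cam):
        f = l2_normalize(cam.reshape(B, C, -1), axis=-1)
        return np.einsum('icf,jcf->cij', f, f)       # (C, B, B)
    return np.mean((rel(t_cam) - rel(s_cam)) ** 2)
```

Each term is zero when student and teacher CAMs coincide and positive otherwise; in practice such terms would be weighted and summed into the total distillation objective.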


Last Update: 2024-12-10