[1]白科,史伟,赵心怡,等.基于FA-ConvNeXt和小样本学习的唐卡主尊识别[J].计算机技术与发展,2025,(06):27-33.[doi:10.20165/j.cnki.ISSN1673-629X.2025.0022]
 BAI Ke, SHI Wei, ZHAO Xin-yi, et al. Thangka Recognition via FA-ConvNeXt and Few-shot Learning[J]. Computer Technology and Development, 2025, (06): 27-33. [doi:10.20165/j.cnki.ISSN1673-629X.2025.0022]

基于FA-ConvNeXt和小样本学习的唐卡主尊识别

《计算机技术与发展》(Computer Technology and Development) [ISSN:1006-6977/CN:61-1281/TN]

Volume:
Issue:
2025, No. 06
Pages:
27-33
Section:
Media Computing
Publication Date:
2025-06-10

文章信息/Article Info

Title:
Thangka Recognition via FA-ConvNeXt and Few-shot Learning
Article No.:
1673-629X(2025)06-0027-07
作者:
白科, 史伟, 赵心怡, 徐家明
宁夏大学 信息工程学院,宁夏 银川 750021
Author(s):
BAI Ke, SHI Wei, ZHAO Xin-yi, XU Jia-ming
School of Information Engineering,Ningxia University,Yinchuan 750021,China
关键词:
唐卡主尊图像识别; 注意力; 多尺度特征增强; 小样本学习; ConvNeXt网络
Keywords:
Thangka main deity recognition; attention; multi-scale feature enhancement; few-shot learning; ConvNeXt network
CLC Number:
TP391.41; TN911.73-34
DOI:
10.20165/j.cnki.ISSN1673-629X.2025.0022
摘要:
针对唐卡主尊图像识别过程中,由于图像结构和纹理特征复杂、颜色绚丽且部分构图元素具有较高相似度而造成识别类别混淆的问题,提出了FA-ConvNeXt网络。首先,对于目前分类方法存在的数据集类别少、数量不平衡等问题,通过查阅资料和采用数据增强方法来扩充数据集。为了提高网络的分类准确度,在ConvNeXt网络架构上引入多尺度特征增强模块(MFEB),使网络更好地提取图像的结构和纹理特征;同时构建多注意力特征提取模块(MAEB),使网络更加关注具有判别性的特征,以减少冗余信息的干扰。通过实验与相关主流模型进行比较,结果表明,提出的FA-ConvNeXt网络识别准确率、召回率及F1值分别达到了97.26%、97.18%、96.38%,较原网络分别提升了7.35百分点、6.94百分点、6.17百分点,且均优于被对比模型。最后将FA-ConvNeXt网络作为唐卡小样本学习的骨干网络,在小样本分类任务中也取得了良好的效果。
Abstract:
To address the problem of class confusion in Thangka main deity image recognition, caused by complex structural and textural features, vibrant colors, and highly similar compositional elements, we propose the FA-ConvNeXt network. First, to tackle the few categories and imbalanced sample counts of existing classification datasets, the dataset is expanded through literature collection and data augmentation. To improve classification accuracy, a Multi-scale Feature Enhancement Block (MFEB) is introduced into the ConvNeXt architecture to better extract the structural and textural features of images. In addition, a Multi-Attention feature Extraction Block (MAEB) that integrates channel and multispectral channel attention is constructed to focus the network on discriminative features and reduce interference from redundant information. Comparative experiments against mainstream models show that the proposed FA-ConvNeXt network achieves recognition accuracy, recall, and F1 scores of 97.26%, 97.18%, and 96.38%, respectively, which are 7.35, 6.94, and 6.17 percentage points higher than the original network and superior to all compared models. Finally, with FA-ConvNeXt as the backbone network for Thangka few-shot learning, good results are also achieved on few-shot classification tasks.
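In the few-shot stage described above, FA-ConvNeXt serves as the backbone, i.e. as an embedding extractor for the images in each episode. The abstract does not name the few-shot method, so the sketch below assumes a standard prototypical-network episode; all function names and the toy embeddings are hypothetical, not the authors' code:

```python
import numpy as np

def prototypes(support, labels, n_way):
    """Class prototype = mean backbone embedding of that class's support images."""
    return np.stack([support[labels == c].mean(axis=0) for c in range(n_way)])

def classify(query, protos):
    """Assign each query embedding to its nearest prototype (Euclidean distance)."""
    d = np.linalg.norm(query[:, None, :] - protos[None, :, :], axis=-1)
    return d.argmin(axis=1)

# Toy 2-way 2-shot episode with 3-dimensional "embeddings"
# (in the paper these would come from the FA-ConvNeXt backbone).
support = np.array([[0., 0., 0.], [0., 0., 1.], [5., 5., 5.], [5., 5., 6.]])
labels  = np.array([0, 0, 1, 1])
query   = np.array([[0., 0., 0.5], [5., 5., 5.5]])

protos = prototypes(support, labels, n_way=2)
print(classify(query, protos))  # -> [0 1]
```

In an N-way K-shot episode the support set holds K labeled images per class; swapping the toy arrays for real backbone embeddings is the only change needed.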


更新日期/Last Update: 2025-06-10