[1]白科,史伟,赵心怡,等.基于FA-ConvNeXt和小样本学习的唐卡主尊识别[J].计算机技术与发展,2025,(06):27-33.[doi:10.20165/j.cnki.ISSN1673-629X.2025.0022]
 BAI Ke, SHI Wei, ZHAO Xin-yi, et al. Thangka Recognition via FA-ConvNeXt and Few-shot Learning[J]. Computer Technology and Development, 2025, (06): 27-33. [doi:10.20165/j.cnki.ISSN1673-629X.2025.0022]

基于FA-ConvNeXt和小样本学习的唐卡主尊识别

《计算机技术与发展》(Computer Technology and Development) [ISSN:1006-6977/CN:61-1281/TN]

Volume:
Issue:
2025, No. 06
Pages:
27-33
Section:
Media Computing
Publication Date:
2025-06-10

文章信息/Article Info

Title:
Thangka Recognition via FA-ConvNeXt and Few-shot Learning
Article No.:
1673-629X(2025)06-0027-07
作者:
白科, 史伟, 赵心怡, 徐家明
宁夏大学 信息工程学院,宁夏 银川 750021
Author(s):
BAI Ke, SHI Wei, ZHAO Xin-yi, XU Jia-ming
School of Information Engineering,Ningxia University,Yinchuan 750021,China
关键词:
唐卡主尊图像识别; 注意力; 多尺度特征增强; 小样本学习; ConvNeXt网络
Keywords:
Thangka main deity recognition; attention; multi-scale feature enhancement; few-shot learning; ConvNeXt network
CLC Number:
TP391.41; TN911.73-34
DOI:
10.20165/j.cnki.ISSN1673-629X.2025.0022
摘要:
针对唐卡主尊图像识别过程中,由于图像结构和纹理特征复杂、颜色绚丽且部分构图元素具有较高相似度而造成识别类别混淆的问题,提出了FA-ConvNeXt网络。首先,对于目前分类方法存在的数据集类别少、数量不平衡等问题,通过查阅资料和采用数据增强方法来扩充数据集。为了提高网络的分类准确度,在ConvNeXt网络架构上引入多尺度特征增强模块(MFEB),使网络更好地提取图像的结构和纹理特征;同时构建多注意力特征提取模块(MAEB),使网络更加关注具有判别性的特征,以减少冗余信息的干扰。通过实验与相关主流模型进行比较,结果表明,提出的FA-ConvNeXt网络识别准确率、召回率及F1值分别达到了97.26%、97.18%、96.38%,较原网络分别提升了7.35百分点、6.94百分点、6.17百分点,且均优于被对比模型。最后将FA-ConvNeXt网络作为唐卡小样本学习的骨干网络,在小样本分类任务中也取得了良好的效果。
Abstract:
To address the problem of class confusion in Thangka main deity image recognition, caused by complex structural and textural features, vibrant colors, and highly similar compositional elements, we propose the FA-ConvNeXt network. First, to tackle the few categories and imbalanced sample counts of existing classification datasets, the dataset is expanded through literature collection and data augmentation. To improve classification accuracy, a Multi-scale Feature Enhancement Block (MFEB) is introduced into the ConvNeXt architecture to better extract the structural and textural features of images. In addition, a Multi-Attention feature Extraction Block (MAEB) that integrates channel and multispectral channel attention is constructed to focus the network on discriminative features and reduce interference from redundant information. Comparative experiments against mainstream models show that the proposed FA-ConvNeXt network achieves recognition accuracy, recall, and F1 scores of 97.26%, 97.18%, and 96.38%, respectively, which are 7.35, 6.94, and 6.17 percentage points higher than the original network and superior to all compared models. Finally, with FA-ConvNeXt as the backbone network for Thangka few-shot learning, good results are also achieved on few-shot classification tasks.
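In the few-shot stage described above, FA-ConvNeXt serves as the backbone, i.e. as an embedding extractor for the images in each episode. The abstract does not name the few-shot method, so the sketch below assumes a standard prototypical-network episode; all function names and the toy embeddings are hypothetical, not the authors' code:

```python
import numpy as np

def prototypes(support, labels, n_way):
    """Class prototype = mean backbone embedding of that class's support images."""
    return np.stack([support[labels == c].mean(axis=0) for c in range(n_way)])

def classify(query, protos):
    """Assign each query embedding to its nearest prototype (Euclidean distance)."""
    d = np.linalg.norm(query[:, None, :] - protos[None, :, :], axis=-1)
    return d.argmin(axis=1)

# Toy 2-way 2-shot episode with 3-dimensional "embeddings"
# (in the paper these would come from the FA-ConvNeXt backbone).
support = np.array([[0., 0., 0.], [0., 0., 1.], [5., 5., 5.], [5., 5., 6.]])
labels  = np.array([0, 0, 1, 1])
query   = np.array([[0., 0., 0.5], [5., 5., 5.5]])

protos = prototypes(support, labels, n_way=2)
print(classify(query, protos))  # -> [0 1]
```

In an N-way K-shot episode the support set holds K labeled images per class; swapping the toy arrays for real backbone embeddings is the only change needed.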


更新日期/Last Update: 2025-06-10