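Zero-shot Architectural Image Classification Method Based on Dual Attention Mechanism - Computer Technology and Development (《计算机技术与发展》)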

Article Info

Title:: Zero-shot Architectural Image Classification Method Based on Dual Attention Mechanism

Article ID:: 1673-629X(2023)10-0035-07

Author(s):: NING Yuan-yuan (宁园园); ZHANG Su-lan (张素兰); CHEN Fei (陈飞); School of Computer Science and Technology, Taiyuan University of Science and Technology, Taiyuan 030024, Shanxi, China

Keywords:: architectural style classification; zero-shot learning; dual attention mechanism; channel attention; spatial attention; space mapping

Classification number:: TP183

DOI:: 10.3969/j.issn.1673-629X.2023.10.006

Abstract:: Zero-shot architectural image classification uses knowledge transfer between known and unknown architectural categories to classify samples of unknown classes when the labeled training samples do not cover all classes. To address the scarcity of labeled data and the inaccurate localization of local discriminative features in architectural style classification, a zero-shot image classification method based on a dual attention mechanism is proposed. First, channel attention and spatial attention models are introduced to enhance the representation of specific image regions: the channel attention network learns the weights of different channels to locate the buildings in the image, while the spatial attention network embeds location information into the channel attention map to capture detailed features of the target, yielding a feature representation that covers both the channel and the spatial dimension. Second, to reduce the information lost during space mapping, a generator is used to reconstruct the visual features. Finally, a zero-shot architectural image classification model with common-space embedding is designed, which aligns visual and semantic features in a subspace and performs classification by nearest-neighbor matching. Experimental results show that, compared with current zero-shot learning methods, the proposed method improves the average classification accuracy by 1.3 and 0.7 percentage points on the zero-shot dataset CUB and on the Architecture Style Dataset, respectively.
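The pipeline named in the abstract (channel attention, then spatial attention, then nearest-neighbor matching in a common space) can be illustrated with a minimal sketch. The abstract does not give layer definitions, so everything concrete here is an assumption: the attention weights are parameter-free stand-ins (a sigmoid of average-pooled activations) rather than learned networks, the generator-based reconstruction step is omitted, the visual vector and class prototypes are assumed to already lie in the same space, and the class names `gothic` and `baroque` with their prototype vectors are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def channel_attention(fmap):
    """Scale each channel by a sigmoid of its global average.
    Stand-in for a learned channel-attention network (assumption)."""
    weights = []
    for ch in fmap:
        vals = [v for row in ch for v in row]
        weights.append(sigmoid(sum(vals) / len(vals)))
    out = [[[v * w for v in row] for row in ch] for ch, w in zip(fmap, weights)]
    return out, weights

def spatial_attention(fmap):
    """Scale each location by a sigmoid of its mean across channels.
    Stand-in for a learned spatial-attention network (assumption)."""
    c, h, w = len(fmap), len(fmap[0]), len(fmap[0][0])
    mask = [[sigmoid(sum(fmap[k][i][j] for k in range(c)) / c)
             for j in range(w)] for i in range(h)]
    out = [[[fmap[k][i][j] * mask[i][j] for j in range(w)]
            for i in range(h)] for k in range(c)]
    return out, mask

def pooled_vector(fmap):
    """Global average pooling: one value per channel."""
    return [sum(v for row in ch for v in row) / (len(ch) * len(ch[0]))
            for ch in fmap]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def nearest_class(visual, prototypes):
    """Return the class whose prototype is most similar to the visual vector."""
    return max(prototypes, key=lambda name: cosine(visual, prototypes[name]))

# Toy feature map: 2 channels, 2x2 spatial grid (made-up values).
fmap = [[[2.0, 2.0], [2.0, 2.0]],
        [[0.1, 0.1], [0.1, 0.1]]]

# Hypothetical unseen-class prototypes, assumed to be semantic vectors
# already mapped into the same 2-D space as the visual feature.
prototypes = {"gothic": [1.0, 0.1], "baroque": [0.1, 1.0]}

attended, channel_weights = channel_attention(fmap)
attended, spatial_mask = spatial_attention(attended)
visual = pooled_vector(attended)
predicted = nearest_class(visual, prototypes)
print(predicted)
```

Applying the spatial mask to the output of the channel step mirrors the abstract's description of embedding location information into the channel attention map; cosine similarity is one common choice for nearest-neighbor matching and is not confirmed by the source.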


Computer Technology and Development (《计算机技术与发展》) [ISSN:1006-6977/CN:61-1281/TN]
