[1] YIN Zi-xuan, SUN Han. Fine-grained Image Retrieval Based on Supervised Hashing with Attention Pyramid[J]. Computer Technology and Development, 2023, 33(03): 20-26. [doi:10.3969/j.issn.1673-629X.2023.03.004]

Fine-grained Image Retrieval Based on Supervised Hashing with Attention Pyramid

Computer Technology and Development (《计算机技术与发展》) [ISSN: 1006-6977 / CN: 61-1281/TN]

Volume:
33
Issue:
2023, No. 03
Pages:
20-26
Column:
Media Computing
Publication date:
2023-03-10

Article Info

Title:
Fine-grained Image Retrieval Based on Supervised Hashing with Attention Pyramid
Article ID:
1673-629X(2023)03-0020-07
Author(s):
YIN Zi-xuan (殷梓轩), SUN Han (孙涵)
College of Computer Science and Technology / College of Artificial Intelligence, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, Jiangsu, China
Keywords:
fine-grained image retrieval; attention pyramid; dual pathway; supervised hashing; stable distribution
CLC number:
TP391
DOI:
10.3969/j.issn.1673-629X.2023.03.004
Abstract:
Large-scale fine-grained image retrieval is a challenging task. Because fine-grained images exhibit small inter-class variation and large intra-class variation, the features learned by traditional deep neural networks are highly redundant, which results in slow retrieval and high storage cost. To address this problem, we propose a deep neural network model that combines an attention pyramid with supervised hashing. In the feature extraction network, we adopt a dual-pathway pyramid structure with a top-down feature pathway and a bottom-up attention pathway, which better fuses high-level semantic information with low-level detailed features. In the classification network, to reduce storage cost and increase retrieval speed, we build on deep hashing and use tanh(x) instead of sign(x) as the activation function, so that the learned hash function reaches a stable distribution more easily. We also combine a quantization loss with a classification loss so that the generated hash codes better match the features of the original input images. On the two standard fine-grained datasets FGVC-Aircraft and Stanford Cars, the proposed algorithm achieves accuracies of 82.3% and 83.3%, respectively, outperforming the other compared algorithms and demonstrating its effectiveness.
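The hashing idea in the abstract (a tanh activation during training, a quantization loss added to a classification loss, and binary codes at retrieval time) can be illustrated with a minimal pure-Python sketch. The abstract does not give the loss formulas, so the squared-gap form of `quantization_loss`, the weight `lam`, and all function names here are assumptions made for illustration, not the authors' implementation.

```python
import math

def hash_layer(logits):
    """Relaxed hash activation: tanh(x) stands in for the non-differentiable sign(x)."""
    return [math.tanh(v) for v in logits]

def binarize(codes):
    """Retrieval-time step: map each relaxed bit to -1 or +1 by its sign."""
    return [1 if v >= 0 else -1 for v in codes]

def quantization_loss(codes):
    """Assumed form: mean squared gap between |bit| and 1 (zero for exact binary codes)."""
    return sum((abs(v) - 1.0) ** 2 for v in codes) / len(codes)

def cross_entropy(class_logits, label):
    """Softmax cross-entropy for one sample, stabilized by subtracting the max logit."""
    m = max(class_logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in class_logits))
    return log_sum - class_logits[label]

def total_loss(hash_logits, class_logits, label, lam=0.1):
    """Classification loss plus a weighted quantization loss (lam is a hypothetical weight)."""
    codes = hash_layer(hash_logits)
    return cross_entropy(class_logits, label) + lam * quantization_loss(codes)

def hamming(a, b):
    """Number of positions at which two binary codes differ."""
    return sum(x != y for x, y in zip(a, b))

# Usage: rank a toy database of 4-bit codes by Hamming distance to a query code.
query = binarize(hash_layer([2.0, -1.5, 0.3, -0.2]))
database = {"a": [1, -1, 1, -1], "b": [-1, 1, 1, 1]}
ranked = sorted(database, key=lambda k: hamming(query, database[k]))
```

Because tanh saturates toward -1 and +1, the quantization term shrinks as the relaxed codes approach binary values, which is one common way to keep the trained codes close to what `binarize` produces at retrieval time.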

Last Update: 2023-03-10