Abstract: Significant progress has been made in instance image retrieval based on deep feature aggregation. However, existing approaches are limited by two issues: (1) deep features cannot localize target objects, which produces inaccurate feature descriptions, and (2) short-vector feature representations yield unsatisfactory retrieval performance. To address these issues, we propose the compact deep semantic correlation descriptor (DSCD) approach, which has three main highlights. (1) Unlike approaches that rely on manual labeling, we propose a channel semantic correlation learning method to localize target objects. This method learns semantic correlation information between channels from an auxiliary dataset and uses it as prior knowledge to localize target objects. (2) We propose a hierarchical attention mechanism to integrate multi-level features within target regions. It uses object and focus attention modules to capture global and local features, and constructs channel attention coefficients to enhance channels that contain important target features. (3) We propose a general yet efficient dimensionality reduction method, adaptive PCA-whitening, to improve retrieval performance with short feature vectors. This method dynamically fuses feature vectors of different dimensions via adaptive coefficients, yielding a more compact and robust representation. Extensive experiments on six benchmark datasets show that our method achieves better retrieval performance than existing state-of-the-art unsupervised methods. Compared with the next-best unsupervised methods, the mAP scores of our method (dim = 128) are 11.2%, 6.1%, 9.3%, 8.3%, 14.3%, and 22.1% higher on the Oxford5K, Paris6K, Oxford105K, Paris106K, ROxford, and RParis datasets, respectively.
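For orientation, the standard PCA-whitening step that the adaptive variant builds on can be sketched as follows. This is a minimal illustration of conventional PCA-whitening for compressing global descriptors, not the adaptive PCA-whitening proposed here: the abstract does not specify how the adaptive coefficients are computed, so no fusion across dimensions is attempted. The function names `fit_pca_whitening` and `apply_pca_whitening`, the `eps` stabilizer, and the final L2 normalization are assumptions made for this example.

```python
import numpy as np

def fit_pca_whitening(X, dim, eps=1e-9):
    """Learn a plain PCA-whitening transform (hypothetical helper).

    X   : (n_samples, n_features) array of image descriptors
    dim : number of output dimensions to keep
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    # Sample covariance of the centered descriptors
    cov = Xc.T @ Xc / (X.shape[0] - 1)
    # eigh handles symmetric matrices and returns ascending eigenvalues
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Keep the `dim` directions with the largest variance
    order = np.argsort(eigvals)[::-1][:dim]
    # Scale each kept axis by 1/sqrt(eigenvalue) so projected variance is ~1
    proj = eigvecs[:, order] / np.sqrt(eigvals[order] + eps)
    return mean, proj

def apply_pca_whitening(X, mean, proj):
    """Project descriptors, then L2-normalize each row (an assumed final step)."""
    Y = (X - mean) @ proj
    norms = np.linalg.norm(Y, axis=1, keepdims=True)
    return Y / np.maximum(norms, 1e-12)
```

In a typical retrieval pipeline the transform would be fitted on descriptors from a separate dataset and then applied to both query and database descriptors before ranking by inner product; whether DSCD follows exactly this protocol is not stated in the abstract.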

Deep Image Retrieval: Indicator and Gram Matrix Weighting for Aggregated Convolutional Features

Aggregating Hierarchical Binary Activations for Image Retrieval

Exploring Geometric Information in CNN for Image Retrieval

Learning Disentangled Representation for Cross-Modal Retrieval with Deep Mutual Information Estimation

Image retrieval based on aggregated deep features weighted by regional significance and channel sensitivity

Learning Feature Embedding with Strong Neural Activations for Fine-Grained Retrieval

Deep Aggregation of Regional Convolutional Activations for Content Based Image Retrieval

Deep Learning Using Isotroping, Laplacing, Eigenvalues Interpolative Binding, and Convolved Determinants with Normed Mapping for Large-Scale Image Retrieval

Learning Token-Based Representation for Image Retrieval

Deep Learning for Content-Based Image Retrieval: A Comprehensive Study

Image retrieval by aggregating deep orientation structure features

Deep Graph Convolutional Quantization Networks for Image Retrieval

Unsupervised Part-based Weighting Aggregation of Deep Convolutional Features for Image Retrieval

Seeing the Big Picture: Deep Embedding with Contextual Evidences

Two-stage Pooling of Deep Convolutional Features for Image Retrieval

Indicative Image Retrieval: Turning Blackbox Learning into Grey

Collaborative Index Embedding for Image Retrieval

Deep-seated features histogram: A novel image retrieval method

Image retrieval using compact deep semantic correlation descriptors

A Feature Learning Approach For Image Retrieval

Towards Better Analysis of Deep Convolutional Neural Networks