Deep Image Retrieval: Indicator and Gram Matrix Weighting for Aggregated Convolutional Features

Zhipeng Wang, Xuanlu Xiang, Zhicheng Zhao, Fei Su
DOI: https://doi.org/10.1109/icme.2018.8486547
2018-07-01
Abstract: Convolutional Neural Networks (CNNs) have proven to be effective feature extractors for many computer vision tasks, such as image classification and object detection. However, image retrieval in realistic scenarios usually involves large-scale unlabeled datasets, so learning a good model is often infeasible. In this paper, we propose a novel and interpretable image representation via spatial-channel weighting for aggregated deep convolutional features. Specifically, we first determine the discriminative regions of an image by computing the Indicator matrix; the distinctive features are then extracted from salient areas by calculating the Gram matrix, in which high-order features are learnt. Finally, a compact image representation is generated by fusing the spatial saliency and channel sensitivity of CNN features. Experimental results on several benchmark datasets (Oxford buildings, Paris buildings and Holidays) indicate that the proposed approach outperforms state-of-the-art methods based on pre-trained deep networks.
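
The abstract names the pipeline stages (Indicator matrix, Gram matrix, fused spatial-channel weighting) but gives no formulas, so the following is only an illustrative sketch of how such a pipeline might look, not the authors' method. Every concrete choice here is an assumption: the function name `aggregate_features`, the mean-threshold rule for the indicator, the log-ratio rule that turns Gram-matrix row sums into channel weights, and the final L2 normalization. The input is assumed to be a non-negative (post-ReLU) feature map of shape (C, H, W).

```python
import numpy as np


def aggregate_features(fmap, eps=1e-12):
    """Hypothetical spatial-channel weighted aggregation of a (C, H, W) feature map."""
    C, H, W = fmap.shape

    # Spatial step (assumed rule): a binary indicator marks locations whose
    # channel-summed activation exceeds the mean over all locations.
    energy = fmap.sum(axis=0)                                   # (H, W)
    indicator = (energy > energy.mean()).astype(fmap.dtype)     # (H, W), values in {0, 1}

    # Keep only features at salient locations, flattened to (C, H*W).
    salient = (fmap * indicator).reshape(C, -1)

    # Gram matrix of channel co-activations (second-order statistics).
    gram = salient @ salient.T                                  # (C, C)

    # Channel step (assumed rule): channels that co-activate strongly with
    # many others receive a lower weight, in the spirit of an IDF term.
    strength = gram.sum(axis=1)                                 # (C,)
    channel_w = np.log((strength.sum() + eps) / (strength + eps))

    # Fuse: sum-pool salient features, apply channel weights, L2-normalize.
    desc = salient.sum(axis=1) * channel_w
    desc = desc / (np.linalg.norm(desc) + eps)
    return desc, indicator, gram


# Toy input standing in for a real CNN activation map (random, seeded).
rng = np.random.default_rng(0)
fmap = rng.random((8, 5, 5))
desc, indicator, gram = aggregate_features(fmap)
```

In a retrieval setting, descriptors produced this way for a query and for database images would typically be compared by cosine similarity, which reduces to a dot product after L2 normalization.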