Abstract: Fine-grained object retrieval, which aims to find objects belonging to the same sub-category as the probe object in a large database, is becoming increasingly popular because of its significance for both research and applications. Recently, deep learning models based on convolutional neural networks (CNNs) have achieved promising retrieval performance, as they can jointly learn feature representations and discriminative distance metrics. Specifically, a generic method is to extract the activations of the fully-connected layer as feature descriptors and to simultaneously optimize classification constraints (e.g., softmax loss) and similarity constraints (e.g., triplet loss) to improve the representational capability of the features. However, typical fully-connected layer activations focus on representing global attributes of the corresponding image and are therefore relatively less sensitive to specific local characteristics. As a result, the features learned through these approaches are in general not sufficiently capable of retrieving fine-grained objects. To address this issue, we propose an effective feature embedding that simultaneously encodes the original global features and discriminative local features, where the local features are extracted by exploiting strong neural activations on the last convolutional layer. We show that the novel feature embedding can dramatically enlarge the gap between inter-class variance and intra-class variance, which is the key factor in improving retrieval precision. In addition, we show that our architecture can also be applied to person re-identification. Experimental results on multiple challenging benchmarks demonstrate that our method outperforms the current state-of-the-art approaches by large margins.
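The two ingredients named in the abstract, a joint classification and similarity objective and an embedding that combines global and local features, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how strong activations are selected or how the features are fused, so the selection rule (the spatial position with the largest channel-summed activation), the fusion by concatenation of L2-normalized vectors, the squared Euclidean distance, and all function names and the `weight` and `margin` parameters are assumptions made purely for illustration.

```python
import math


def softmax_cross_entropy(logits, label):
    """Softmax cross-entropy of `logits` against the integer class `label`."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge loss: the positive should be closer to the anchor than the negative by `margin`."""
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(0.0, gap + margin)


def joint_loss(logits, label, anchor, positive, negative, weight=1.0, margin=1.0):
    """Classification loss plus a weighted similarity loss (the weighting is an assumption)."""
    return softmax_cross_entropy(logits, label) + weight * triplet_loss(
        anchor, positive, negative, margin
    )


def strongest_location_descriptor(feature_map):
    """Return the channel vector at the spatial position with the largest summed activation.

    `feature_map[c][y][x]` is a last-convolutional-layer activation.
    This selection rule is an assumed stand-in for "strong neural activations".
    """
    channels = len(feature_map)
    height = len(feature_map[0])
    width = len(feature_map[0][0])
    best_y, best_x = max(
        ((y, x) for y in range(height) for x in range(width)),
        key=lambda p: sum(feature_map[c][p[0]][p[1]] for c in range(channels)),
    )
    return [feature_map[c][best_y][best_x] for c in range(channels)]


def l2_normalize(vector):
    """Scale a vector to unit length; zero vectors are returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)


def embed(global_feature, feature_map):
    """Concatenate the normalized global feature with the normalized local descriptor."""
    local_feature = strongest_location_descriptor(feature_map)
    return l2_normalize(global_feature) + l2_normalize(local_feature)


# Toy usage: a 2-channel, 2x2 feature map and a 2-dimensional global feature.
toy_map = [[[0.0, 1.0], [2.0, 0.0]], [[0.0, 5.0], [1.0, 0.0]]]
toy_embedding = embed([3.0, 4.0], toy_map)
```

In this sketch the embedding would be fed to `joint_loss` for an anchor, a same-sub-category positive, and a different-sub-category negative; a real system would compute these quantities inside a deep learning framework so that gradients flow back to the network.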

Delving into Fully Convolutional Networks Activations for Visual Recognition.

Aggregating Hierarchical Binary Activations for Image Retrieval

Exploiting Hierarchical Activations of Neural Network for Image Retrieval.

Learning Feature Embedding with Strong Neural Activations for Fine-Grained Retrieval

Advances in Convolutional Neural Networks

Cross-convolutional-layer Pooling for Generic Visual Recognition.

The Treasure beneath Convolutional Layers: Cross-convolutional-layer Pooling for Image Classification

The Treasure Beneath Convolutional Layers: Cross-Convolutional-Layer Pooling For Image Classification

Convolutional Networks with Cross-Layer Neurons for Image Recognition

Fully Convolutional Attention Networks for Fine-Grained Recognition

On the Large-Scale Transferability of Convolutional Neural Networks.

Feature Extraction and Image Recognition with Convolutional Neural Networks

Return of the Devil in the Details: Delving Deep into Convolutional Nets

Fully Convolutional Neural Networks With Full-Scale-Features For Semantic Segmentation

Evolving Convolutional Neural Network And Its Application In Fine-Grained Visual Categorization

Towards Better Analysis of Deep Convolutional Neural Networks

Transform-Invariant Convolutional Neural Networks for Image Classification and Search

Convolutional Channel Features

Visualizing and Comparing Convolutional Neural Networks

Fully Convolutional Networks for Semantic Segmentation

Fully-Convolutional Intensive Feature Flow Neural Network for Text Recognition