Abstract: Fine-grained object retrieval, which aims to find objects belonging to the same sub-category as the probe object in a large database, is becoming increasingly popular because of its research and application significance. Recently, deep learning models based on convolutional neural networks (CNNs) have achieved promising retrieval performance, as they can jointly learn feature representations and discriminative distance metrics. Specifically, a generic method is to extract the activations of the fully-connected layer as feature descriptors and to simultaneously optimize classification constraints (e.g., softmax loss) and similarity constraints (e.g., triplet loss) to improve the representational capability of the features. However, typical fully-connected layer activations focus on representing global attributes of the corresponding image and are therefore relatively insensitive to specific local characteristics. As a result, the features learned through these approaches are in general not sufficiently capable of retrieving fine-grained objects. To address this issue, we propose an effective feature embedding that simultaneously encodes the original global features and discriminative local features, where the local features are extracted by exploiting strong neural activations on the last convolutional layer. We show that the novel feature embedding can dramatically enlarge the gap between inter-class variance and intra-class variance, which is the key factor in improving retrieval precision. In addition, we show that our architecture can also be applied to person re-identification. Experimental results on multiple challenging benchmarks demonstrate that our method outperforms the current state-of-the-art approaches by large margins.
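
The ideas named in the abstract (a joint softmax and triplet objective, plus an embedding that concatenates global features with local features taken from strong convolutional activations) can be illustrated with a minimal sketch. The abstract does not specify how the strong activations are selected or how the two feature types are combined, so everything below is an assumption made for illustration: per-channel global max pooling stands in for "strong neural activations", the two descriptors are L2-normalized and concatenated, and the names `strong_activation_descriptor`, `embed`, `joint_loss`, `weight`, and `margin` are hypothetical.

```python
import math

def l2_normalize(v):
    # Scale a vector to unit length; a zero vector is returned unchanged.
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def strong_activation_descriptor(conv_map):
    # conv_map: list of C channels, each an H x W list of lists.
    # ASSUMPTION: keep only the strongest response of each channel
    # (global max pooling) as a stand-in for "strong neural activations".
    return [max(max(row) for row in channel) for channel in conv_map]

def embed(global_feat, conv_map):
    # ASSUMPTION: concatenate the normalized global (fully-connected)
    # descriptor with the normalized local (convolutional) descriptor.
    local_feat = strong_activation_descriptor(conv_map)
    return l2_normalize(global_feat) + l2_normalize(local_feat)

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def triplet_loss(anchor, positive, negative, margin=0.2):
    # Similarity constraint: the anchor should be closer to the positive
    # than to the negative by at least `margin`.
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(0.0, gap + margin)

def softmax_cross_entropy(logits, label):
    # Classification constraint, computed with a stable log-sum-exp.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]

def joint_loss(logits, label, anchor, positive, negative, weight=1.0, margin=0.2):
    # Weighted sum of the classification and similarity terms.
    return softmax_cross_entropy(logits, label) + weight * triplet_loss(
        anchor, positive, negative, margin
    )
```

In this sketch the embeddings passed to `triplet_loss` would be outputs of `embed`, so both the global and the local parts influence the distance used for retrieval. A real system would compute these quantities on batches inside a deep learning framework and backpropagate through the network.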

Part-based Fine-Grained Bird Image Retrieval Respecting Species Correlation

Searching by Parts: Towards Fine-Grained Image Retrieval Respecting Species Correlation

Fine-Grained Visual Categorization With Fine-Tuned Segmentation

Learning Feature Embedding with Strong Neural Activations for Fine-Grained Retrieval

Learning Multi-attention Convolutional Neural Network for Fine-Grained Image Recognition

Selective Parts For Fine-Grained Recognition

Mask-CNN: Localizing Parts and Selecting Descriptors for Fine-Grained Bird Species Categorization

Hierarchical Part Matching for Fine-Grained Visual Categorization

Picking Deep Filter Responses for Fine-Grained Image Recognition

Research on Fine-Grained Image Recognition of Birds Based on Improved YOLOv5

Iterative Object and Part Transfer for Fine-Grained Recognition

Weakly Supervised Fine-Grained Image Recognition Based on Multi-Channel Attention and Object Localization

Real Time Fine-Grained Categorization with Accuracy and Interpretability

Learning Rich Part Hierarchies with Progressive Attention Networks for Fine-Grained Image Recognition

Large-Scale Fine-Grained Bird Recognition Based on a Triplet Network and Bilinear Model

Three-way Enhanced Part-Aware Network for Fine-Grained Sketch-Based Image Retrieval

Fine-Grained Image Categorization by Localizing Tiny Object Parts from Unannotated Images

Cross-Part Learning for Fine-Grained Image Classification

Cross-X Learning for Fine-Grained Visual Categorization

Look Closer to See Better: Recurrent Attention Convolutional Neural Network for Fine-Grained Image Recognition