Abstract: Fine-grained visual categorization (FGVC) refers to assigning fine-grained labels to images that belong to the same base category. Because inter-class similarity is high, it is challenging to distinguish images of different subcategories. Recently, researchers have proposed first localizing key object parts within images and then finding discriminative clues on those parts. To localize object parts, existing methods train detectors for different kinds of object parts. However, because the same kind of object part often varies greatly in appearance across images, existing methods have two shortcomings: 1) training part detectors for object parts with diverse appearance is laborious; 2) discriminative parts with unusual appearance may be missed by the trained part detectors. To localize the key object parts efficiently and accurately, this paper proposes a novel FGVC method. Its main novelty is that it localizes the key object parts within each image using only that single image, and hence avoids the influence of appearance diversity between parts in different images. The proposed method consists of two key steps. First, it localizes the key parts in each image independently: potential object parts in each image are identified, and these potential parts are then merged to generate the final representative object parts. Second, two kinds of features are extracted to describe simultaneously the discriminative clues within each part and the relationships between object parts. In addition, a part-based dropout learning technique is adopted to further boost classification performance. Comparison experiments show that the proposed method achieves performance comparable to or better than state-of-the-art methods.
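
The abstract does not spell out how the part-based dropout works, so the following is only a minimal sketch of one plausible reading: during training, the whole feature vector of a randomly chosen part is zeroed, so the classifier cannot rely on any single part. The function name `part_dropout`, the `drop_prob` parameter, and the rule of always keeping at least one part are assumptions made for illustration and are not taken from the paper.

```python
import random


def part_dropout(part_features, drop_prob=0.5, rng=None):
    """Zero out whole part feature vectors at random (training time only).

    part_features: list of per-part feature vectors (lists of floats).
    drop_prob: hypothetical probability of dropping each part.
    At least one part is always kept so the output is never all zeros
    (an assumption of this sketch, not a detail from the paper).
    """
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    rng = rng or random.Random()

    # Decide independently for each part whether it is kept.
    keep = [rng.random() >= drop_prob for _ in part_features]
    if part_features and not any(keep):
        keep[rng.randrange(len(part_features))] = True

    # Concatenate kept parts unchanged and dropped parts as zeros, so the
    # output length stays fixed regardless of which parts were dropped.
    out = []
    for feat, kept in zip(part_features, keep):
        out.extend(feat if kept else [0.0] * len(feat))
    return out, keep
```

In this sketch, a call such as `part_dropout([[1.0, 2.0], [3.0, 4.0]], drop_prob=0.3)` returns the concatenated feature vector together with the keep mask. At test time the part features would simply be concatenated without dropping anything.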

Learning Mutually Exclusive Part Representations for Fine-Grained Image Classification

Fine-Grained Visual Categorization With Fine-Tuned Segmentation

Learning Enhanced Features and Inferring Twice for Fine-Grained Image Classification

Multi-discriminative Parts Mining for Fine-Grained Visual Classification

Feature Re-Attention and Multi-Layer Feature Fusion for Fine-Grained Visual Classification

Fine-Grained Image Classification with Object-Part Model

Feature Boosting, Suppression, and Diversification for Fine-Grained Visual Classification

Cross-Part Learning for Fine-Grained Image Classification

Learning Semantically Enhanced Feature for Fine-Grained Image Classification

Fine-Grained Visual Categorization by Localizing Object Parts With Single Image

Learning Hierarchal Channel Attention for Fine-grained Visual Classification

Fine-Grained Image Classification Via Combining Vision And Language

Learning Multi-attention Convolutional Neural Network for Fine-Grained Image Recognition

Fine-Grained Image Retrieval Via Multiple Part-Level Feature Ensemble

Learning Two-level Features for Fine-grained Image Classification

Fine-Grained Visual Classification Via Simultaneously Learning of Multi-regional Multi-grained Features

Channel Boosting, Cross-Layer Feature Integration, and Multi-Scale Classification for Fine-Grained Visual Classification

Object-Part Attention Model for Fine-Grained Image Classification

Weakly Supervised Fine-Grained Image Recognition Based on Multi-Channel Attention and Object Localization

Selecting Discriminative Features for Fine-Grained Visual Classification

Weakly Supervised Complementary Parts Models for Fine-Grained Image Classification from the Bottom Up