Fine-grained Image Classification Combined with Label Description

Xiruo Shi, Liutong Xu, Pengfei Wang
DOI: https://doi.org/10.1109/ictai.2019.00148
2019-01-01
Abstract: Fine-grained image classification is challenging because fine-grained images look similar overall and the discriminative regions are difficult to find. In this task, label descriptions generally contain valuable semantic information that corresponds closely to the discriminative features of the images (e.g., the description of the "Rusty Black Bird" corresponds to the morphological characteristics of its image). Taking these descriptions into account helps to distinguish such similar images. Previous works, however, usually ignore label descriptions and mine informative features from images alone, which may limit their performance. In this paper, we take both label descriptions and images into consideration, and we recast the classification task as a matching task to address this issue. Specifically, our model combines Convolutional Neural Networks (CNN) over images with Graph Convolutional Networks (GCN) over label descriptions. We map the resulting image representations and text representations to the same dimension and perform classification through a matching operation between them. Experimental results demonstrate that our approach outperforms state-of-the-art methods on the Stanford Dogs and CUB-200-2011 datasets.
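The matching formulation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the matching operation, so cosine similarity is assumed here, and the plain feature vectors stand in for the outputs of the CNN and GCN. The names `linear`, `match_scores`, `classify`, `W_img`, and `W_txt` are hypothetical and chosen only for illustration.

```python
import math

def linear(x, W):
    """Project vector x with weight matrix W (a list of rows)."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def l2_normalize(v):
    """Scale v to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(a * a for a in v)) or 1.0
    return [a / norm for a in v]

def match_scores(image_feat, label_feats, W_img, W_txt):
    """Score one image against every label description.

    image_feat  : stand-in for a CNN image representation
    label_feats : stand-ins for GCN label-description representations
    W_img, W_txt: projections into a shared dimension (assumed linear)
    The score is cosine similarity, which is an assumption of this sketch.
    """
    z_img = l2_normalize(linear(image_feat, W_img))
    scores = []
    for label_feat in label_feats:
        z_txt = l2_normalize(linear(label_feat, W_txt))
        scores.append(sum(a * b for a, b in zip(z_img, z_txt)))
    return scores

def classify(image_feat, label_feats, W_img, W_txt):
    """Predict the index of the label whose description matches best."""
    scores = match_scores(image_feat, label_feats, W_img, W_txt)
    return max(range(len(scores)), key=scores.__getitem__)

# Toy example: 3-d image features and 2-d label features, shared dimension 2.
W_img = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
W_txt = [[1.0, 0.0], [0.0, 1.0]]
label_feats = [[1.0, 0.0], [0.0, 1.0]]
predicted = classify([0.9, 0.1, 5.0], label_feats, W_img, W_txt)
```

In this toy setup the image projects closest to the first label description, so `predicted` is index 0. In a trained model the projections would be learned jointly with the two encoders rather than fixed by hand.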