ASP-CNN: Aligning Semantic Parts for Fine-Grained Image Classification

Hao Ge, Xiaoguang Tu, Mei Xie, Zheng Ma
DOI: https://doi.org/10.1117/1.jei.28.2.023024
IF: 0.829
2019-01-01
Journal of Electronic Imaging
Abstract: Recently, numerous methods have been proposed to tackle the problem of fine-grained image classification (FGIC). Most of them follow a two-step strategy: detecting the object regions, then classifying with the features extracted from these regions. For feature extraction, the most popular method is to crop the feature maps directly according to the locations of the detected part regions. However, one challenge of such a method is that the direction of the semantic parts may vary across images, so it is necessary to capture such differences for better classification. We propose a CNN architecture that aligns semantic parts (ASP-CNN) for FGIC, aiming to increase the interclass variance while reducing the intraclass variance in fine-grained datasets. Extensive experiments on CUB-200-2011 and CUB-200-2010 show the effectiveness of our ASP-CNN.
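The baseline feature-extraction step mentioned in the abstract, cropping a feature map according to a detected part location, can be sketched roughly as follows. This is only an illustration of that generic cropping idea, not the ASP-CNN alignment itself, which the abstract does not detail. The function name `crop_part_features`, the single-channel list-of-lists feature map, and the fixed `stride` between image pixels and feature cells are all assumptions made for the example.

```python
from typing import List, Tuple

FeatureMap = List[List[float]]  # one channel, indexed as [row][col]


def crop_part_features(
    feature_map: FeatureMap,
    box: Tuple[int, int, int, int],
    stride: int,
) -> FeatureMap:
    """Crop the feature cells covered by a part box.

    box is (x0, y0, x1, y1) in image pixels (hypothetical convention);
    stride is the assumed number of image pixels per feature cell.
    """
    x0, y0, x1, y1 = box
    height, width = len(feature_map), len(feature_map[0])

    # Round the start down and the end up so every touched cell is kept.
    col0 = max(0, x0 // stride)
    row0 = max(0, y0 // stride)
    col1 = min(width, -(-x1 // stride))   # ceiling division
    row1 = min(height, -(-y1 // stride))  # ceiling division

    return [row[col0:col1] for row in feature_map[row0:row1]]


# Toy 4x4 feature map whose values encode their own position.
toy_map = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
part = crop_part_features(toy_map, box=(16, 16, 48, 40), stride=16)
print(part)  # [[5.0, 6.0], [9.0, 10.0]]
```

A crop like this keeps whatever orientation the part has in the image, which is the limitation the abstract points to: two crops of the same part can differ simply because the part faces a different direction.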