DSP: Discriminative Spatial Part Modeling for Fine-Grained Visual Categorization
Hantao Yao, Dongming Zhang, Jintao Li, Jianshe Zhou, Shiliang Zhang, Yongdong Zhang
DOI: https://doi.org/10.1016/j.imavis.2017.05.003
IF: 3.86
2017-01-01
Image and Vision Computing
Abstract: Unlike basic-level classification, Fine-Grained Visual Categorization (FGVC) aims to distinguish subcategories within the same basic-level category, such as different species of birds, which makes it more challenging. Significant advances have recently been achieved in FGVC. However, most existing methods require bounding boxes or part annotations for training and testing, which limits their usability and flexibility. To overcome these limitations, we automatically detect the bounding boxes and parts for FGVC. Bounding boxes for testing images are obtained by transferring them from training images. Based on the generated bounding boxes, we employ a multiple-layer Orientational Spatial Part (OSP) model to learn local parts of the object. To achieve more discriminative part modeling, we propose the Discriminative Spatial Part (DSP) model, which selects the discriminative parts from OSP. Finally, we use a Convolutional Neural Network (CNN) as the feature extractor and train a linear SVM as the classifier. Extensive experiments on public benchmark datasets demonstrate the effectiveness of our method: classification accuracy reaches 79.8% on CUB-200-2011 and 85.7% on Aircraft, higher than many existing methods that use manual annotations.
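The pipeline stages named in the abstract (box transfer, multi-layer spatial parts, discriminative part selection) can be illustrated with a minimal sketch. The abstract does not specify how any stage is implemented, so everything below is an assumption made for illustration: box transfer is approximated by averaging the boxes of the k nearest training images in feature space, the multi-layer part model by a plain n x n grid per level (not the paper's OSP), and part selection by ranking externally supplied scores. The function names `transfer_box`, `grid_parts`, and `select_parts` are hypothetical.

```python
from math import sqrt


def transfer_box(query_feat, train_feats, train_boxes, k=3):
    """Assumed stand-in for box transfer: average the boxes of the
    k training images closest to the query in Euclidean feature distance."""
    dists = sorted(
        (sqrt(sum((q - t) ** 2 for q, t in zip(query_feat, feat))), i)
        for i, feat in enumerate(train_feats)
    )
    nearest = [i for _, i in dists[:k]]
    return tuple(
        sum(train_boxes[i][c] for i in nearest) / len(nearest) for c in range(4)
    )


def grid_parts(box, levels=(1, 2)):
    """Assumed stand-in for a multi-layer part model: split the box
    (x0, y0, x1, y1) into an n x n grid of cells for each n in levels."""
    x0, y0, x1, y1 = box
    parts = []
    for n in levels:
        w, h = (x1 - x0) / n, (y1 - y0) / n
        for r in range(n):
            for c in range(n):
                parts.append(
                    (x0 + c * w, y0 + r * h, x0 + (c + 1) * w, y0 + (r + 1) * h)
                )
    return parts


def select_parts(parts, scores, top_k):
    """Assumed stand-in for discriminative selection: keep the top_k
    highest-scoring parts, returned in their original order."""
    ranked = sorted(range(len(parts)), key=lambda i: scores[i], reverse=True)
    return [parts[i] for i in sorted(ranked[:top_k])]
```

In a full system, CNN features would be extracted from each selected part and concatenated before training a linear SVM, as the abstract states; the per-part scores passed to `select_parts` would have to come from some measure of discriminative power, for example validation accuracy per part, which is again an assumption here.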