Learning Class-Specific Pooling Shapes for Image Classification.

Jinzhuo Wang,Wenmin Wang,Ronggang Wang,Wen Gao
DOI: https://doi.org/10.1109/icme.2015.7177433
2015-01-01
Abstract:Spatial pyramid (SP) representation is an extension of bag-of-feature model which embeds spatial layout information of local features by pooling feature codes over pre-defined spatial shapes. However, the uniform style of spatial pooling shapes used in standard SP is an ad-hoc manner without theoretical motivation, thus lacking the generalization power to adapt to different distribution of geometric properties across image classes. In this paper, we propose a data-driven approach to adaptively learn class-specific pooling shapes (CSPS). Specifically, we first establish an over-complete set of spatial shapes providing candidates with more flexible geometric patterns. Then the optimal subset for each class is selected by training a linear classifier with structured sparsity constraint and color distribution cues. To further enhance the robust of our model, the representations over CSPS are compressed according to the shape importance and finally fed to SVM with a multi-shape matching kernel for classification task. Experimental results on three challenging datasets (Caltech-256, Scene-15 and Indoor-67) demonstrate the effectiveness of the proposed method on both object and scene images.
What problem does this paper attempt to address?