Spatial Pooling for Transformation Invariant Image Representation

Xia Li,Yan Song,Yijuan Lu,Qi Tian
DOI: https://doi.org/10.1145/2072298.2072052
2011-01-01
Abstract:Spatial Pyramid Matching (SPM) [2] has been proposed to extend the Bag-of-Word (BoW) model for object classification. By re-serving the finer level information, it makes image matching more accurate. However, for not well-aligned images, where the object is rotated, flipped or translated, SPM may lose its discrimination power. To tackle this problem, we propose novel spatial pooling layouts to address various transformations, and generate a more general image representation. To evaluate the effectiveness of the proposed approach, we conduct extensive experiments on three transformation emphasized datasets for object classification task. Experimental results demonstrate its superiority over the state-of-the-arts. Besides, the proposed image representation is compact and consistent with the BoW model, which makes it applicable to image retrieval task as well.
What problem does this paper attempt to address?