Max-margin Analysis Based Patch Sampling for Discovery of Mid-Level Parts

Lingxiao Yang, Xiaohua Xie
DOI: https://doi.org/10.1109/icip.2015.7351194
2015-01-01
Abstract: Discovering representative and discriminative mid-level parts is crucial for visual recognition models such as Bag-of-Parts. We present a weakly supervised approach that learns class-specific mid-level parts from an image database using only image-level labels, with no additional human annotation. First, we employ an SVM-like model to sample discriminant patches from each image. This model corresponds to a max-margin analysis between a specific image patch and the other patches in the whole training set, and it can be easily solved in closed form. For each class, the sampled patches are then clustered agglomeratively to generate the final semantic parts, while the less representative patches are discarded. The approach is effective because it sequentially discards non-discriminative and non-representative patches. It is also efficient because the clustering step only needs to handle a small number of discriminant patches. Using the learned parts as a visual codebook yields state-of-the-art results on scene classification benchmarks.
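The two-stage pipeline described in the abstract (closed-form patch sampling, then agglomerative clustering of the survivors) can be illustrated with a minimal sketch. The abstract does not give the closed-form solution, so the code below substitutes an LDA-style whitened detector, w = Σ⁻¹(x − μ), as a stand-in, and a plain average-linkage merge over cosine similarity for the clustering step. The function names, the scoring rule, and the parameters `top_k`, `reg`, `sim_threshold`, and `min_size` are all hypothetical and are not taken from the paper.

```python
import numpy as np


def sample_discriminant_patches(feats, image_ids, top_k=2, reg=1e-3):
    """Keep the top_k highest-scoring patches of every image.

    ASSUMPTION: an LDA-style closed form stands in for the paper's
    SVM-like model; each patch is scored against statistics of all patches.
    """
    mu = feats.mean(axis=0)
    cov = np.cov(feats, rowvar=False) + reg * np.eye(feats.shape[1])
    centered = feats - mu
    # One detector per patch: w_i = cov^{-1} (x_i - mu), computed in closed form.
    detectors = np.linalg.solve(cov, centered.T).T
    # Response of each patch under its own detector (a margin-like score).
    scores = np.einsum("ij,ij->i", detectors, centered)
    keep = []
    for img in np.unique(image_ids):
        idx = np.flatnonzero(image_ids == img)
        best = idx[np.argsort(scores[idx])[::-1][:top_k]]
        keep.extend(best.tolist())
    return np.array(sorted(keep)), scores


def agglomerative_parts(feats, sim_threshold=0.5, min_size=2):
    """Average-linkage agglomerative clustering on cosine similarity.

    Merging stops once the best pair falls below sim_threshold; clusters
    smaller than min_size are dropped as non-representative.
    """
    norms = np.linalg.norm(feats, axis=1, keepdims=True)
    unit = feats / np.maximum(norms, 1e-12)
    sim = unit @ unit.T
    clusters = [[i] for i in range(len(feats))]
    while len(clusters) > 1:
        best, pair = -np.inf, None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                s = sim[np.ix_(clusters[a], clusters[b])].mean()
                if s > best:
                    best, pair = s, (a, b)
        if best < sim_threshold:
            break
        a, b = pair
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters if len(c) >= min_size]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(20, 4))        # 20 patch descriptors, 4-D (toy data)
    image_ids = np.repeat(np.arange(4), 5)  # 4 images, 5 patches each
    keep, _ = sample_discriminant_patches(feats, image_ids, top_k=2)
    parts = agglomerative_parts(feats[keep], sim_threshold=0.3)
    print(len(keep), "patches sampled;", len(parts), "candidate parts")
```

Because sampling runs first, the quadratic-cost clustering loop only sees `top_k` patches per image rather than every patch, which mirrors the efficiency argument made in the abstract.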