PartBook for Image Parsing

Kuiyuan Yang, Lei Zhang, Yong Rui, Hong-Jiang Zhang
DOI: https://doi.org/10.1109/cvprw.2012.6239169
2012-01-01
Abstract: Effective image parsing needs a representation that is both selective (to inter-class variations) and invariant (to intra-class variations). CodeBook from the bag-of-visual-words representation addresses the invariance, and part-based models can potentially address the selectivity. However, existing part-based approaches either require expensive manual object-level labeling or make strong assumptions not applicable to real-world images. In this paper, we propose a PartBook approach that simultaneously overcomes the above two difficulties. Furthermore, we present an effective framework that integrates CodeBook and PartBook, which achieves both intra-class invariance and inter-class selectivity. Specifically, a set of candidate regions is first selected from heat map-like representations obtained by an SVM classifier trained for each category. Then the regions are clustered based on a dense matching-based similarity, and a part detector is learned from each cluster and further refined by utilizing a latent SVM. The learned PartBook summarizes the most representative mid-level patterns of each category, and can be readily used for image parsing tasks to identify not only objects but also different parts of an object. Extensive experimental results on real-world images show that the automatically learned parts are semantically meaningful, and demonstrate the effectiveness of PartBook in image parsing tasks at different levels.
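As a rough illustration of the first step named in the abstract (selecting candidate regions from a heat map-like representation), the sketch below slides a fixed-size window over a 2D score map and greedily keeps the highest-scoring non-overlapping windows. The abstract does not specify the selection rule, so the mean-score criterion, the fixed window size, the greedy non-overlap filter, and all function names here are assumptions made for illustration, not the authors' method.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (row, col, height, width); hypothetical layout


def window_score(heat: List[List[float]], r: int, c: int, h: int, w: int) -> float:
    """Mean heat-map response inside the window (assumed scoring rule)."""
    total = sum(heat[i][j] for i in range(r, r + h) for j in range(c, c + w))
    return total / (h * w)


def overlaps(a: Box, b: Box) -> bool:
    """True if the two boxes share at least one cell."""
    ar, ac, ah, aw = a
    br, bc, bh, bw = b
    return ar < br + bh and br < ar + ah and ac < bc + bw and bc < ac + aw


def select_candidate_regions(
    heat: List[List[float]], h: int, w: int, top_k: int
) -> List[Box]:
    """Greedily pick up to top_k non-overlapping windows by descending score."""
    rows, cols = len(heat), len(heat[0])
    scored = [
        (window_score(heat, r, c, h, w), (r, c, h, w))
        for r in range(rows - h + 1)
        for c in range(cols - w + 1)
    ]
    # Stable sort: ties keep their scan order (top-left first).
    scored.sort(key=lambda item: item[0], reverse=True)

    chosen: List[Box] = []
    for _, box in scored:
        if len(chosen) == top_k:
            break
        if all(not overlaps(box, kept) for kept in chosen):
            chosen.append(box)
    return chosen
```

In a pipeline like the one the abstract outlines, the returned boxes would be the regions passed on to clustering; here `heat` stands in for a per-category classifier response map.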