Combining Context, Consistency, and Diversity Cues for Interactive Image Categorization

Zhiwu Lu, Horace H. S. Ip
DOI: https://doi.org/10.1109/tmm.2010.2041100
IF: 7.3
2010-01-01
IEEE Transactions on Multimedia
Abstract: This paper presents a novel graph-based framework that combines context, consistency, and diversity cues for interactive image categorization. Images are first represented with visual keywords, obtained by dividing each image into blocks and then clustering those blocks. The context across visual keywords within an image is then captured by a 2-D spatial Markov chain model. To develop a graph-based approach to image categorization, we incorporate this intra-image context into a new class of kernel, the spatial Markov kernel, which is used to define the affinity matrix of a graph. Once the graph is constructed with this kernel, the large pool of unlabeled data can be exploited by graph-based semi-supervised learning, through label propagation with inter-image consistency. For interactive image categorization, we further combine this semi-supervised learning with active learning by defining a new diversity-based data selection criterion using spectral embedding. Experiments demonstrate that the proposed framework can achieve promising results.
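
The label propagation step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a generic propagation rule of the form F <- alpha * S * F + (1 - alpha) * Y over a symmetrically normalized affinity matrix, and it substitutes a plain Gaussian affinity on toy 2-D points for the spatial Markov kernel, whose definition the abstract does not give. The function names, `alpha`, `sigma`, and the toy data are all hypothetical.

```python
import math


def gaussian_affinity(points, sigma=1.0):
    """Stand-in affinity matrix (assumption): the paper builds its graph
    from a spatial Markov kernel, which is not reproduced here."""
    n = len(points)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:  # no self-loops
                d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                W[i][j] = math.exp(-d2 / (2.0 * sigma ** 2))
    return W


def propagate_labels(W, labels, num_classes, alpha=0.9, iters=200):
    """Iterate F <- alpha * S F + (1 - alpha) * Y, with S = D^-1/2 W D^-1/2.

    labels[i] is a class index for a labeled item, None for an unlabeled one.
    Returns the predicted class index for every item.
    """
    n = len(W)
    degree = [sum(row) for row in W]
    inv_sqrt = [1.0 / math.sqrt(d) if d > 0 else 0.0 for d in degree]
    S = [[W[i][j] * inv_sqrt[i] * inv_sqrt[j] for j in range(n)]
         for i in range(n)]
    # One-hot rows for labeled items, all-zero rows for unlabeled items.
    Y = [[1.0 if labels[i] == c else 0.0 for c in range(num_classes)]
         for i in range(n)]
    F = [row[:] for row in Y]
    for _ in range(iters):
        F = [[alpha * sum(S[i][k] * F[k][c] for k in range(n))
              + (1.0 - alpha) * Y[i][c]
              for c in range(num_classes)]
             for i in range(n)]
    return [max(range(num_classes), key=lambda c: F[i][c]) for i in range(n)]


# Toy data: two well-separated groups, one labeled item in each.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
labels = [0, None, None, 1, None, None]
predicted = propagate_labels(gaussian_affinity(points), labels, num_classes=2)
print(predicted)
```

In this toy setting each unlabeled point takes the label of the single labeled point in its own group, which is the kind of inter-item consistency that graph-based propagation relies on. Swapping in a different affinity matrix only requires replacing `gaussian_affinity`.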