Visual and textual fusion for semantically supervised region-based retrieval
Rongrong Ji,Hongxun Yao,Pengfei Xu,Xiaoshuai Sun
DOI: https://doi.org/10.1007/s00530-009-0154-4
IF: 3.9
2009-01-01
Multimedia Systems
Abstract:This paper presents a unified annotation and retrieval framework, which integrates region annotation with image retrieval for performance reinforcement. To integrate semantic annotation with region-based image retrieval, visual and textual fusion is proposed for both soft matching and Bayesian probabilistic formulations. To address sample insufficiency and sample asymmetry in the annotation classifier training phase, we present a region-level multi-label image annotation scheme based on pair-wise coupling support vector machine (SVM) learning. In the retrieval phase, to achieve semantic-level region matching we present a novel retrieval scheme which differs from former work: the query example uploaded by users is automatically annotated online, and the user can judge its annotation quality. Based on the user’s judgment, two novel schemes are deployed for semantic retrieval: (1) if the user judges the photo to be well annotated, Semantically supervised Integrated Region Matching is adopted, which is a keyword-integrated soft region matching method; (2) If the user judges the photo to be poorly annotated, Keyword Integrated Bayesian Reasoning is adopted, which is a natural integration of a Visual Dictionary in online content-based search. In the relevance feedback phase, we conduct both visual and textual learning to capture the user’s retrieval target. Better annotation and retrieval performance than current methods were reported on both COREL 10,000 and Flickr web image database (25,000 images), which demonstrated the effectiveness of our proposed framework.