Visual & textual fusion for region retrieval: from both fuzzy matching and Bayesian reasoning aspects.

Rongrong Ji, Hongxun Yao
DOI: https://doi.org/10.1145/1290082.1290106
2007-01-01
Abstract: This paper presents a novel visual & textual information fusion framework for region-based image retrieval. We explore linguistic-integrated region retrieval from both Bayesian reasoning and fuzzy region matching aspects. First, to associate textual information with image regions, we present a region-based soft annotation strategy. Our method automatically labels each image region with multiple keywords, each of which is assigned a confidence factor to indicate its annotation accuracy. In annotation classifier training, we adopt a pairwise coupling (PWC) SVM bagging network to address the problems of sample insufficiency and sample asymmetry. Subsequently, in image retrieval, we fuse regions' visual & textual information to rank image similarities at the perceptual level. Two fusion schemes are explored in the proposed framework: (1) Semantic-Supervised Integrated Region Matching (SSIRM) and (2) Keyword-Integrated Bayesian Reasoning (KIBR). SSIRM is a keyword-integrated fuzzy region matching strategy, adopted when the query image is pre-annotated. KIBR is adopted when the query image is non-annotated or poorly annotated; it supports both query-by-example and query-by-keyword based on a statistical text-image translation model. Finally, in relevance feedback (RF) learning, we exploit a unified visual & textual learning algorithm to precisely capture users' retrieval intention. Our experiments show superior annotation, retrieval (over IRM), and RF performance (over both IRM + SVM at the region level and SVM, ALSVM, and ABSVM at the global level), which demonstrates the efficiency of the proposed fusion framework in bridging the semantic gap.