Semantic Illustration Retrieval for Very Large Data Set

Song Kai, Hu Tie-jun, Tian Yong-hong
2006-01-01
Abstract: In this paper, we present a system that performs illustration retrieval on a very large data set. Traditional text-based retrieval systems often perform poorly on illustration retrieval because some illustrations are uncaptioned. Even worse, the textual information is often mixed with noise and therefore fails to represent the illustrations accurately. To overcome these problems, we propose a semantic model for illustration retrieval. In this model, we first extract shape information, and the similarities between shapes are then used to construct two link graphs. Based on these graphs, we run an auto-captioning procedure on the uncaptioned illustrations. In addition, cross-modal analysis is applied to remove the noisy information and reduce the dimensionality of the feature vectors. Finally, we introduce a re-ranking scheme that returns as many subtopics related to the query as possible while also improving relevance. Experiments on approximately 500,000 illustrations show that our system efficiently retrieves illustrations with high relevance and diversity.
Index Terms: illustration retrieval, feature extraction, cross-modal analysis, ranking
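The abstract does not say how the re-ranking step balances relevance against subtopic coverage. As a rough illustration of the general idea only, the sketch below uses a greedy maximal-marginal-relevance style selection, which is a swapped-in, well-known technique and not necessarily the authors' scheme. The function names, the `trade_off` parameter, and the toy feature vectors are all hypothetical.

```python
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rerank_for_diversity(
    relevance: Dict[str, float],
    features: Dict[str, Sequence[float]],
    k: int,
    trade_off: float = 0.7,
) -> List[str]:
    """Greedy MMR-style re-ranking (an assumed stand-in, not the paper's method).

    At each step, pick the candidate that maximizes
    trade_off * relevance - (1 - trade_off) * (max similarity to items already picked).
    """
    selected: List[str] = []
    candidates = set(relevance)

    while candidates and len(selected) < k:

        def score(item: str) -> float:
            # Redundancy is the similarity to the closest already-selected item.
            redundancy = max(
                (cosine(features[item], features[s]) for s in selected),
                default=0.0,
            )
            return trade_off * relevance[item] - (1.0 - trade_off) * redundancy

        # Sort first so that ties are broken deterministically by item id.
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)

    return selected


# Toy data: "a" and "b" are near-duplicates, "c" covers a different subtopic.
relevance = {"a": 0.9, "b": 0.85, "c": 0.5}
features = {"a": [1.0, 0.0], "b": [1.0, 0.0], "c": [0.0, 1.0]}
print(rerank_for_diversity(relevance, features, k=2, trade_off=0.5))
```

With `trade_off=1.0` the function reduces to a plain relevance ranking; lowering it penalizes candidates that resemble results already chosen, which is one simple way to surface more subtopics in the top results.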