Mining Semantic Correlation of Heterogeneous Multimedia Data for Cross-Media Retrieval

Yue-Ting Zhuang, Yi Yang, Fei Wu
DOI: https://doi.org/10.1109/TMM.2007.911822
IF: 7.3
2008-01-01
IEEE Transactions on Multimedia
Abstract: Although multimedia objects such as images, audio clips, and texts are of different modalities, there is a great deal of semantic correlation among them. In this paper, we propose a transductive learning method to mine the semantic correlations among media objects of different modalities in order to achieve cross-media retrieval. Cross-media retrieval is a new kind of search technology in which the query examples and the returned results can be of different modalities, e.g., querying images with an audio example. First, according to the media objects' features and their co-existence information, we construct a uniform cross-media correlation graph, in which media objects of different modalities are represented uniformly. To perform cross-media retrieval, a positive score is assigned to the query example; the score spreads along the graph, and the media objects of the target modality, or the multimedia documents (MMDs), with the highest scores are returned. To boost retrieval performance, we also propose different approaches to long-term and short-term relevance feedback to mine the information contained in the positive and negative examples.
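The score-spreading step described in the abstract can be illustrated with a small sketch. The abstract does not give the update rule, so the code below assumes a generic manifold-ranking-style iteration (symmetrically normalized edge weights, damping factor `alpha`). It is not necessarily the formulation used in the paper. The toy graph, the edge weights, and the names `spread_scores` and `retrieve` are hypothetical and chosen only for illustration.

```python
import numpy as np

def spread_scores(weights, query_index, alpha=0.8, iterations=50):
    """Spread a positive query score over a weighted, undirected graph.

    Assumed update rule (not taken from the paper):
        f <- alpha * S @ f + (1 - alpha) * y
    where S is the symmetrically normalized weight matrix and y is 1 at
    the query node and 0 elsewhere.
    """
    weights = np.asarray(weights, dtype=float)
    degree = weights.sum(axis=1)
    degree[degree == 0] = 1.0  # avoid division by zero for isolated nodes
    inv_sqrt = 1.0 / np.sqrt(degree)
    normalized = weights * inv_sqrt[:, None] * inv_sqrt[None, :]

    initial = np.zeros(len(weights))
    initial[query_index] = 1.0  # positive score assigned to the query example
    scores = initial.copy()
    for _ in range(iterations):
        scores = alpha * (normalized @ scores) + (1 - alpha) * initial
    return scores

def retrieve(weights, modalities, query_index, target_modality, top_k=2):
    """Return indices of the top-scoring nodes of the target modality."""
    scores = spread_scores(weights, query_index)
    candidates = [i for i, m in enumerate(modalities)
                  if m == target_modality and i != query_index]
    return sorted(candidates, key=lambda i: scores[i], reverse=True)[:top_k]

# Hypothetical toy graph: node 0 is an audio query, nodes 1, 2, 4 are images,
# node 3 is a text. Edge weights stand in for feature similarity or
# co-existence in the same multimedia document.
modalities = ["audio", "image", "image", "text", "image"]
weights = np.zeros((5, 5))
for a, b, w in [(0, 1, 1.0), (1, 2, 0.5), (0, 3, 0.8), (3, 4, 0.3)]:
    weights[a, b] = weights[b, a] = w

ranked_images = retrieve(weights, modalities, query_index=0,
                         target_modality="image")
print(ranked_images)
```

In this sketch an audio query returns images even though no audio-to-image feature comparison is made: the score reaches the images only through graph edges, which is the idea behind representing all modalities in one correlation graph.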