Cross-Media Retrieval Method Based on Content Correlations

张鸿,吴飞,庄越挺,陈建勋
DOI: https://doi.org/10.3724/sp.j.1016.2008.00820
2009-01-01
Chinese Journal of Computers
Abstract: Most traditional content-based multimedia retrieval methods are designed for multimedia data of a single modality. Such methods include image retrieval, audio retrieval, video retrieval, etc. This paper proposes a novel cross-media retrieval approach that can process multimedia data of different modalities and measure cross-media similarity, such as image-audio similarity. First, a statistical method is used to learn canonical correlations between the low-level feature spaces of different modalities. Then, a subspace mapping is designed to build an isomorphic subspace and solve the heterogeneity problem between different low-level feature vectors. This subspace contains media objects of different modalities, and each media object is represented by an isomorphic vector. Since the canonical correlations among multimedia objects are preserved as far as possible during the mapping process, cross-media similarity can be estimated with a defined distance metric. Furthermore, relevance feedback provided by users is utilized to learn prior knowledge and refine the multimedia topology in the subspace. In this way, with the incorporation of user interaction, cross-media similarity becomes more consistent with human perception. Both image and audio data are selected for experiments and comparisons. Given the same visual and auditory features, the new approach outperforms the ICA, PCA and PLS methods in both precision and recall. Overall, the cross-media retrieval results between images and audio clips are very encouraging.
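The first two steps of the abstract (learning canonical correlations, then mapping both modalities into one shared subspace where a distance can be computed) can be illustrated with a minimal sketch. It is not the authors' implementation: it uses textbook canonical correlation analysis solved by whitening and SVD, the feature matrices are synthetic stand-ins for image and audio features, and the names `fit_cca` and `cross_media_distance`, the regularization constant `reg`, and the choice of cosine distance are all assumptions made here for illustration. The abstract does not specify the distance metric, and the relevance-feedback step is not covered.

```python
import numpy as np

def fit_cca(X, Y, k, reg=1e-6):
    """Textbook CCA: return means, projections, and top-k canonical correlations."""
    mx, my = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mx, Y - my
    n = X.shape[0]
    # Regularized covariance blocks (reg is an assumed small ridge term).
    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)
    # Cholesky factors: Sxx = Lx Lx^T, Syy = Ly Ly^T.
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    # Whitened cross-covariance T = Lx^{-1} Sxy Ly^{-T}.
    T = np.linalg.solve(Lx, Sxy)
    T = np.linalg.solve(Ly, T.T).T
    U, s, Vt = np.linalg.svd(T, full_matrices=False)
    # Map the singular vectors back to the original feature spaces.
    Wx = np.linalg.solve(Lx.T, U[:, :k])
    Wy = np.linalg.solve(Ly.T, Vt.T[:, :k])
    return mx, my, Wx, Wy, s[:k]

def cross_media_distance(img_feat, aud_feat, model):
    """Cosine distance between an image and an audio clip in the shared subspace."""
    mx, my, Wx, Wy, _ = model
    p = (img_feat - mx) @ Wx   # image vector in the shared subspace
    q = (aud_feat - my) @ Wy   # audio vector in the shared subspace
    cos = p @ q / (np.linalg.norm(p) * np.linalg.norm(q) + 1e-12)
    return 1.0 - cos

# Synthetic stand-ins: two "modalities" generated from a shared 2-D latent factor.
rng = np.random.default_rng(0)
n = 500
Z = rng.normal(size=(n, 2))
X = Z @ rng.normal(size=(2, 6)) + 0.1 * rng.normal(size=(n, 6))  # "image" features
Y = Z @ rng.normal(size=(2, 4)) + 0.1 * rng.normal(size=(n, 4))  # "audio" features

model = fit_cca(X, Y, k=2)
matched = np.mean([cross_media_distance(X[i], Y[i], model) for i in range(50)])
mismatched = np.mean([cross_media_distance(X[i], Y[i + 50], model) for i in range(50)])
```

In this toy setting, a paired image and audio clip should on average lie closer in the subspace than an unpaired one, which is the property that makes cross-media ranking possible. Real visual and auditory features would replace `X` and `Y`.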