Discriminative Latent Feature Space Learning for Cross-Modal Retrieval

Xu Tang, Cheng Deng, Xinbo Gao
DOI: https://doi.org/10.1145/2671188.2749322
2015-01-01
Abstract: Cross-modal retrieval has drawn much attention in recent years due to its wide range of applications. Most existing methods focus only on relevance and overlook the heterogeneity and discriminability of features from different modalities; how to capture and correlate these heterogeneous features remains a challenge in this field. We therefore propose a general model that jointly learns a discriminative latent feature space for effective cross-modal retrieval. Concretely, a class-specific dictionary is learned for each modality, and all resulting sparse codes are simultaneously mapped into a common feature space that describes and associates the cross-modal data. Moreover, label information is leveraged to separate different classes within each modality and to merge samples of the same class across modalities. Cross-modal retrieval is finally performed over the learned common feature space. Experimental results on two public datasets confirm that our cross-modal method outperforms several competing methods.
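The final retrieval step described in the abstract, ranking items of one modality against a query from another inside the common feature space, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the sparse codes and the per-modality linear mappings have already been learned, and it assumes cosine similarity as the ranking measure, which the abstract does not specify. The names project, cosine, and retrieve, and the parameters query_map and gallery_map, are hypothetical.

```python
import math


def project(code, mapping):
    """Map one sparse code into the common space as z = W a.

    `mapping` is a hypothetical learned matrix W, given as a list of rows.
    """
    return [sum(w * a for w, a in zip(row, code)) for row in mapping]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def retrieve(query_code, query_map, gallery_codes, gallery_map, top_k=3):
    """Rank gallery items (other modality) by similarity to the query.

    Both sides are projected into the common space first. Similarity is
    an assumption of this sketch (cosine); ties are broken by index.
    """
    q = project(query_code, query_map)
    scored = [
        (cosine(q, project(g, gallery_map)), i)
        for i, g in enumerate(gallery_codes)
    ]
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [i for _, i in scored[:top_k]]
```

Called as retrieve(image_code, image_map, text_codes, text_map), the function returns the indices of the text items closest to the image query; swapping the arguments gives text-to-image retrieval over the same space.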