Enhanced Isomorphic Semantic Representation For Cross-Media Retrieval

Ting Liu, Yao Zhao, Shikui Wei, Yunchao Wei, Lixin Liao
DOI: https://doi.org/10.1109/ICME.2017.8019356
2017-01-01
Abstract: Cross-media retrieval is a useful technology that helps people find the information they need in huge amounts of multimodal data more efficiently. A common cross-media retrieval framework first maps features of different modalities into an isomorphic semantic space so that the similarity between heterogeneous data can be measured. In most semantic-space-based methods, the mapping from the original space to the semantic space is optimized independently for each modality, and the fact that one modality may be more discriminative than another is not taken into account. In this paper, we propose a deep framework that introduces a latent embedding layer to learn joint parameters and obtain semantically meaningful representations of images and texts. Specifically, the discriminative characteristics embedded in the textual modality can be transferred to images through the latent embedding layer and the joint parameters, enhancing the consistency between the semantic representations. Extensive experiments on three popular publicly available datasets demonstrate the superiority of the proposed method, which achieves new state-of-the-art results.
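The general idea described in the abstract, mapping both modalities through a latent embedding layer into a shared semantic space and then comparing them there, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the feature dimensions, the tanh activation, the softmax semantic layer, the cosine similarity measure, and the random untrained weights are all assumptions chosen only to make the data flow concrete.

```python
import numpy as np

def softmax(z):
    # Row-wise softmax with the usual max-subtraction for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def semantic_repr(features, w_modality, w_joint):
    # Modality-specific projection into a latent embedding of common size
    # (hypothetical layer; the activation is an assumption).
    latent = np.tanh(features @ w_modality)
    # Joint parameters shared by both modalities map the latent embedding
    # to a distribution over semantic classes.
    return softmax(latent @ w_joint)

def cosine_similarity(a, b):
    # Pairwise cosine similarity between rows of a and rows of b.
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

rng = np.random.default_rng(0)

# Illustrative sizes only; none of these come from the paper.
n_img, n_txt = 4, 5
img_dim, txt_dim, latent_dim, n_classes = 16, 8, 6, 3

# Random, untrained weights standing in for learned parameters.
w_img = rng.normal(size=(img_dim, latent_dim))
w_txt = rng.normal(size=(txt_dim, latent_dim))
w_joint = rng.normal(size=(latent_dim, n_classes))  # shared across modalities

img_sem = semantic_repr(rng.normal(size=(n_img, img_dim)), w_img, w_joint)
txt_sem = semantic_repr(rng.normal(size=(n_txt, txt_dim)), w_txt, w_joint)

# Image-to-text retrieval: rank all texts for each image query.
sim = cosine_similarity(img_sem, txt_sem)
ranking = np.argsort(-sim, axis=1)
```

Because both modalities pass through the same `w_joint`, their outputs live in the same semantic space and can be compared directly. In a trained model these weights would be learned jointly rather than sampled at random.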