Cross-media Retrieval by Exploiting Fine-Grained Correlation at Entity Level

Lei Huang, Yuxin Peng
DOI: https://doi.org/10.1016/j.neucom.2016.07.067
IF: 6
2016-01-01
Neurocomputing
Abstract: In cross-media retrieval, a user submits a query of any media type and receives semantically relevant results of different media types. Most existing approaches project the low-level features of cross-media data onto a unified feature space. However, some of these feature spaces have no explicit semantics and therefore ignore the intrinsic semantic information contained in the original media content. The others have only coarse-grained semantics and suffer from the ambiguity of high-level concepts, because they simply exploit the coarse-grained correlation between low-level features and high-level concepts. Hence, these approaches cannot generate a descriptive representation of media content, which reduces their effectiveness in measuring the semantic similarities among cross-media data. To address these problems, we propose a novel approach to cross-media retrieval that exploits the fine-grained correlation at the entity level and generates a unified descriptive representation. Concretely, the proposed approach first constructs an entity level with fine-grained semantics between low-level features and high-level concepts. Second, by minimizing the distances between media content with positive correlation at the entity level and maximizing the distances between media content with negative correlation, we learn distance-preserving entity projections (DPEP) and generate the unified descriptive representation of media content. Experimental results on two publicly available datasets demonstrate the effectiveness of our approach.
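The abstract does not give the DPEP objective itself, so the following is only a minimal sketch of the general idea it describes: pull positively correlated cross-media pairs together and push negatively correlated pairs apart in a shared space. It swaps in a standard contrastive (margin-based) loss with one linear projection per media type, which is an assumption and not the authors' formulation. The names `pair_loss`, `pair_loss_grad`, `W_img`, `W_txt` and the `margin` parameter are hypothetical.

```python
import numpy as np


def pair_loss(W_img, W_txt, X_img, X_txt, labels, margin=1.0):
    """Contrastive-style loss over image/text pairs (illustrative, not DPEP).

    labels[i] = 1 marks a positively correlated pair, 0 a negative one.
    Positive pairs are penalized by their squared distance; negative pairs
    are penalized only while they are closer than `margin`.
    """
    diff = X_img @ W_img - X_txt @ W_txt      # pairwise differences in the shared space
    dist = np.linalg.norm(diff, axis=1)       # Euclidean distance per pair
    pos = labels * dist ** 2
    neg = (1 - labels) * np.maximum(0.0, margin - dist) ** 2
    return float(np.mean(pos + neg))


def pair_loss_grad(W_img, W_txt, X_img, X_txt, labels, margin=1.0):
    """Analytic gradient of pair_loss with respect to both projections."""
    n = X_img.shape[0]
    diff = X_img @ W_img - X_txt @ W_txt
    dist = np.linalg.norm(diff, axis=1)
    safe = np.maximum(dist, 1e-12)            # avoid division by zero for coincident pairs
    slack = np.maximum(0.0, margin - dist)    # zero once a negative pair is far enough apart
    coeff = 2.0 * labels - 2.0 * (1 - labels) * slack / safe
    G = coeff[:, None] * diff / n             # gradient with respect to each row of diff
    return X_img.T @ G, -(X_txt.T @ G)


# Toy data: 6 pairs, 5-d "image" features, 4-d "text" features, 3-d shared space.
rng = np.random.default_rng(0)
X_img = rng.normal(size=(6, 5))
X_txt = rng.normal(size=(6, 4))
labels = np.array([1, 0, 1, 0, 1, 0], dtype=float)
W_img = 0.1 * rng.normal(size=(5, 3))
W_txt = 0.1 * rng.normal(size=(4, 3))

# One plain gradient-descent step on both projections.
g_img, g_txt = pair_loss_grad(W_img, W_txt, X_img, X_txt, labels)
W_img_new = W_img - 0.01 * g_img
W_txt_new = W_txt - 0.01 * g_txt
```

In the paper's setting, the positive and negative correlations come from the entity level rather than from hand-assigned pair labels, and the learned projections yield the unified descriptive representation; the toy labels here stand in for that signal only for illustration.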