Cluster-sensitive Structured Correlation Analysis for Web cross-modal retrieval

Shuhui Wang, Fuzhen Zhuang, Shuqiang Jiang, Qingming Huang, Qi Tian
DOI: https://doi.org/10.1016/j.neucom.2015.05.049
IF: 6
2015-01-01
Neurocomputing
Abstract: Modern cross-modal retrieval technology is required to find semantically relevant content across heterogeneous modalities. Because previous studies construct unified dense correlation models on small-scale cross-modal data, they cannot process large-scale Web data, for three reasons: (a) the content of Web cross-media data is divergent; (b) the topic-sensitive structure information in the high-dimensional space is neglected; and (c) data must be organized as strictly corresponding pairs, a requirement that is not satisfied in real-world scenarios. To address these challenges, we propose a cluster-sensitive cross-modal correlation learning framework. First, a set of cluster-sensitive correlation sub-models is learned instead of a unified correlation model, which better fits the content divergence in different modalities. We impose structured sparsity regularization on the projection vectors to learn a set of interpretable structured sparse correlation sub-models. Second, to compensate for missing correspondences, we take full advantage of both intra-modal affinity and inter-modal co-occurrence. The projected coordinates of adjacent data within a modality tend to be similar, and the inconsistency of cluster-sensitive projection is minimized. The learned correlation model adapts to the content divergence and thus achieves better model generality and a better bias-variance trade-off. Extensive experiments on two large-scale cross-modal datasets demonstrate the effectiveness of our approach.