Similarity and Diversity Induced Paired Projection for Cross-Modal Retrieval

Jinxing Li, Mu Li, Guangming Lu, Bob Zhang, Hongpeng Yin, David Zhang
DOI: https://doi.org/10.1016/j.ins.2020.06.032
IF: 8.1
2020-01-01
Information Sciences
Abstract: The heterogeneous gap across modalities is a critical problem in many applications (e.g., retrieval). Because cross-modal learning aims to learn a common representation while each modality also carries its own specific components, this paper proposes a similarity and diversity induced paired projection (SDPP) method. SDPP not only extracts the correlation in a common subspace, but also removes the view-specific information that does not contribute to the task. To model the specific components, the Hilbert-Schmidt Independence Criterion (HSIC) is introduced as a co-regularization term that explicitly enforces diversity. Additionally, unlike some existing subspace learning methods that are time-consuming in the testing phase, SDPP uses a paired projection strategy that obtains the similar information in a simple but effective way. To optimize the proposed approach, an efficient algorithm is designed that updates the variables alternately. Finally, we apply our strategy to cross-modal retrieval, and experimental results on several real-world datasets substantiate the effectiveness and superiority of our model compared with other state-of-the-art methods.
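The abstract does not give the exact form of the HSIC term used in SDPP. As a rough illustration only, the sketch below computes the standard biased empirical HSIC estimate, tr(KHLH) / (n - 1)^2, between two sets of paired features. The linear kernels and the function names `centering_matrix` and `empirical_hsic` are assumptions made for this example, not details taken from the paper.

```python
import numpy as np


def centering_matrix(n):
    """Return H = I - (1/n) * 1 1^T, which centers a kernel matrix."""
    return np.eye(n) - np.ones((n, n)) / n


def empirical_hsic(a, b):
    """Biased empirical HSIC between paired samples.

    a: array of shape (n, d_a), b: array of shape (n, d_b); row i of `a`
    is paired with row i of `b`. Linear kernels are an assumption here;
    the paper may use a different kernel or normalization.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n = a.shape[0]
    k = a @ a.T  # linear kernel on the first feature set
    l = b @ b.T  # linear kernel on the second feature set
    h = centering_matrix(n)
    return float(np.trace(k @ h @ l @ h)) / (n - 1) ** 2
```

A value near zero suggests the two feature sets are (linearly) uncorrelated, which is the sense in which a penalty of this kind could push view-specific components away from one another; larger values indicate stronger dependence.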