Progressive Cross-Media Correlation Learning.

Xin Huang, Yuxin Peng
DOI: https://doi.org/10.1007/978-981-13-1702-6_20
2018-01-01
Abstract: Cross-media retrieval aims to retrieve data across different media types, such as image and text; its key problem is learning cross-media correlation from known training data. Existing methods indiscriminately use all data for model training, ignoring that some hard samples carry misleading or even noisy information, which has a negative effect especially in the early period of model training. Because cross-media training data is difficult to collect, the common challenge of small-scale training data makes this problem even more severe and limits the robustness and accuracy of cross-media retrieval. To address this problem, this paper proposes the Progressive Cross-media Correlation Learning (PCCL) approach, which uses a large-scale cross-media dataset with general knowledge (reference data) to guide correlation learning on another small-scale dataset (target data) via a progressive sample selection mechanism. Specifically, we first pre-train a hierarchical correlation learning network on the reference data as the reference model, which is used to assign different learning difficulties to samples in the target data via intra-media and inter-media relevance significance metric. Then, training samples in the target data are selected with gradually ascending learning difficulties, so that the correlation learning process can progressively reduce the “heterogeneity gap” to enhance model robustness and improve retrieval accuracy. We take our self-constructed large-scale XMediaNet dataset as the reference data, and cross-media retrieval experiments on 2 widely-used datasets show that PCCL outperforms 9 state-of-the-art methods.
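The idea of selecting training samples with gradually ascending learning difficulty can be illustrated with a minimal sketch. This is not the authors' implementation: in PCCL the difficulties come from the reference model's relevance significance metric, whereas the scores below are placeholder numbers, and the function name `progressive_schedule`, the `num_stages` parameter, and the linear pacing rule are all assumptions made for illustration.

```python
from typing import List, Sequence


def progressive_schedule(difficulty: Sequence[float], num_stages: int) -> List[List[int]]:
    """Return, for each stage, the indices of the samples admitted to training.

    Samples are ranked from easiest to hardest, and each stage admits a
    larger prefix of that ranking, so the last stage covers all samples.
    """
    if num_stages < 1:
        raise ValueError("num_stages must be at least 1")
    # Rank sample indices by ascending difficulty (ties keep original order).
    order = sorted(range(len(difficulty)), key=lambda i: difficulty[i])
    stages = []
    for stage in range(1, num_stages + 1):
        # Linear pacing (an assumption): admit ceil(n * stage / num_stages) samples.
        cutoff = (len(order) * stage + num_stages - 1) // num_stages
        stages.append(order[:cutoff])
    return stages


# Hypothetical difficulty scores for six target-data samples.
scores = [0.9, 0.1, 0.5, 0.3, 0.7, 0.2]
schedule = progressive_schedule(scores, num_stages=3)
for stage_number, indices in enumerate(schedule, start=1):
    print(f"stage {stage_number}: train on samples {indices}")
```

With these placeholder scores, the first stage trains only on the two easiest samples, and the hardest sample (index 0) is admitted only in the final stage, so hard and possibly noisy samples do not influence the early period of training.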