Modality-specific Adaptive Scaling Method for Cross-modal Retrieval

Baitao Chen, Xiao Ke
DOI: https://doi.org/10.1109/icicml57342.2022.10009863
2022-01-01
Abstract: Different modalities differ greatly in data distribution and feature representation, so retrieving data flexibly and accurately across modalities is a challenging problem. Mainstream common-subspace methods focus only on the heterogeneity gap between modalities and use a unified method to jointly learn the common representation of different modalities, which can easily make unified fitting across modalities difficult. In this work, we propose the concept of multi-modal information density discrepancy and a modality-specific adaptive scaling method (MASM) that incorporates prior knowledge and adaptively learns the most suitable network for each modality. Comprehensive experiments on three widely used cross-modal retrieval datasets show that the proposed MASM achieves state-of-the-art results and significantly outperforms existing methods.