Sequential Learning for Cross-Modal Retrieval.

Ge Song, Xiaoyang Tan
DOI: https://doi.org/10.1109/iccvw.2019.00554
2019-01-01
Abstract: Cross-modal retrieval has attracted increasing attention with the rapid growth of multimodal data, but its learning paradigm under changing environments is less studied. Inspired by recent findings in cognitive science on how the human brain acquires knowledge, we propose a new sequential learning method for cross-modal retrieval. In this method, a unified model is maintained to capture the common knowledge of the various modalities, but it is learnt in a sequential manner, so that it adapts to the evolving distribution of the different modalities and needs no laborious alignment of multimodal data before learning. Furthermore, we propose a novel meta-learning-based method to overcome the catastrophic forgetting encountered in sequential learning. Extensive experiments on three popular multimodal datasets show that our method achieves state-of-the-art cross-modal retrieval performance without any modal alignment.