Fast Unmediated Hashing for Cross-Modal Retrieval.

Xiushan Nie, Xingbo Liu, Xiaoming Xi, Chenglong Li, Yilong Yin
DOI: https://doi.org/10.1109/tcsvt.2020.3042972
IF: 5.859
2020-01-01
IEEE Transactions on Circuits and Systems for Video Technology
Abstract: Cross-modal hashing compresses heterogeneous multi-modal data into compact binary codes for cross-modal retrieval, where accuracy and efficiency are the two primary concerns. To achieve both, we propose a novel method named Fast Unmediated Hashing (FUH). Motivated by the fact that a label vector is a natural binary representation of a sample for retrieval, we learn the cross-modal hash codes directly from semantic labels, without any intermediate representation. This captures more relations among the modalities and reduces the number of variables. However, learning hash codes directly from labels would weaken their discriminative power. To address this issue, we propose double supervision, which combines label information with pairwise similarity, to make the codes more discriminative. In addition, to decrease training time, we present a strategy that bypasses the similarity-matrix-related operation in each optimization iteration, so that some other related terms can also be computed offline, lowering training complexity. Experiments on three public datasets show that FUH outperforms several state-of-the-art techniques in both efficiency and accuracy.
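The abstract does not give the formulation behind "bypassing the similarity matrix-related operation", so the following is only a minimal sketch of one common way such a bypass can work, not the FUH algorithm itself. It assumes (hypothetically) that the pairwise similarity is the label-cosine matrix S = 2 L̂ L̂ᵀ - 1 1ᵀ, where L̂ holds row-normalized label vectors. Under that assumption, a product of the codes with S can be computed through the labels without ever building the n x n matrix. All function and variable names below are illustrative.

```python
import numpy as np


def normalize_rows(labels):
    """Scale each label vector to unit L2 norm (all-zero rows stay zero)."""
    norms = np.linalg.norm(labels, axis=1, keepdims=True)
    return labels / np.maximum(norms, 1e-12)


def codes_times_similarity_naive(B, L_hat):
    """Compute B^T S by materializing the n x n matrix S.

    Assumed (not from the paper): S = 2 * L_hat @ L_hat.T - ones((n, n)).
    Memory is O(n^2) and the product costs O(n^2 * r).
    """
    n = L_hat.shape[0]
    S = 2.0 * (L_hat @ L_hat.T) - np.ones((n, n))
    return B.T @ S


def codes_times_similarity_fast(B, L_hat):
    """Compute the same r x n product without forming S.

    B^T S = 2 (B^T L_hat) L_hat^T - (B^T 1) 1^T, which costs O(n * c * r).
    """
    projected = B.T @ L_hat                      # r x c
    code_sums = B.T.sum(axis=1, keepdims=True)   # r x 1, equals B^T 1
    # Subtracting the r x 1 column broadcasts it across all n columns.
    return 2.0 * (projected @ L_hat.T) - code_sums


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, c, r = 200, 10, 16                        # samples, classes, code length
    labels = (rng.random((n, c)) < 0.3).astype(float)
    labels[np.arange(n), rng.integers(0, c, n)] = 1.0  # at least one label each
    B = np.where(rng.standard_normal((n, r)) >= 0, 1.0, -1.0)

    L_hat = normalize_rows(labels)
    slow = codes_times_similarity_naive(B, L_hat)
    fast = codes_times_similarity_fast(B, L_hat)
    print("max abs difference:", np.abs(slow - fast).max())
```

With n samples, c classes and r-bit codes, the factored form replaces an O(n²) object with products whose cost grows linearly in n. This is the kind of saving the abstract attributes to skipping similarity-matrix operations in each iteration.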