Label-Based Deep Semantic Hashing for Cross-Modal Retrieval

Weiwei Weng, Jiagao Wu, Lu Yang, Linfeng Liu, Bin Hu
DOI: https://doi.org/10.1007/978-3-030-36718-3_3
2019-01-01
Abstract: With the arrival of the big data era, the volume of multimodal data has grown explosively, and cross-modal retrieval has attracted growing research interest. Owing to their low storage cost and fast query speed, hashing-based methods have made great advances in cross-modal retrieval. Most previous hashing methods build a label-based similarity-preserving matrix that describes only a binary similarity relationship between multimodal data, i.e., similar or dissimilar. This approach is adequate for single-label data, but for multi-label data it fails to exploit the rich semantic information that the labels carry. In this paper, we propose a new cross-modal retrieval method called Label-Based Deep Semantic Hashing (LDSH). In this method, a new similarity-preserving matrix is derived from the multiple labels of each item to describe the degree of similarity between multimodal data. Moreover, the last fully connected layer of the deep neural network is designed as a Block Structure (B-Structure) to reduce the redundancy between generated bits. To accelerate the convergence of the neural network, a Batch Normalization Layer (BN-Layer) is placed after the B-Structure. Extensive experiments on two real-world datasets with image and text modalities demonstrate the superiority of the proposed method in cross-modal retrieval tasks.
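The contrast between a binary and a graded similarity-preserving matrix can be illustrated with a small sketch. The abstract does not state the formula LDSH uses, so the cosine similarity between multi-hot label vectors below is only an assumed stand-in for a "degree of similarity", and the function names and toy label vectors are hypothetical.

```python
import math


def binary_similarity(labels_a, labels_b):
    """Return 1 if the two items share at least one label, else 0."""
    return 1 if any(a and b for a, b in zip(labels_a, labels_b)) else 0


def graded_similarity(labels_a, labels_b):
    """Cosine similarity of two multi-hot label vectors, in [0, 1].

    Assumption: cosine is used here for illustration only; the paper's
    actual definition is not given in the abstract.
    """
    dot = sum(a * b for a, b in zip(labels_a, labels_b))
    norm_a = math.sqrt(sum(a * a for a in labels_a))
    norm_b = math.sqrt(sum(b * b for b in labels_b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an item without labels is treated as dissimilar
    return dot / (norm_a * norm_b)


def similarity_matrix(image_labels, text_labels, sim):
    """Build the image-by-text matrix of pairwise similarities."""
    return [[sim(u, v) for v in text_labels] for u in image_labels]


# Hypothetical toy data: 2 images and 3 texts over 4 possible labels.
image_labels = [[1, 1, 0, 0], [0, 0, 1, 1]]
text_labels = [[1, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1]]

binary = similarity_matrix(image_labels, text_labels, binary_similarity)
graded = similarity_matrix(image_labels, text_labels, graded_similarity)

for row_b, row_g in zip(binary, graded):
    print(row_b, [round(s, 2) for s in row_g])
```

For the first image, the binary matrix assigns the same value to a text sharing both labels and a text sharing only one, whereas the graded matrix ranks the full match higher. This is the kind of distinction a multi-label similarity matrix is meant to preserve.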