Abstract: With the explosive growth of multimodal data, cross-modal retrieval has drawn increasing research interest. Hashing-based methods have advanced cross-modal retrieval considerably, owing to their low storage cost and fast query speed. However, the heterogeneity gap between modalities still makes it challenging to improve retrieval accuracy. To tackle this problem, we propose a new two-stage cross-modal retrieval method called Deep Semantic Hashing with Dual Attention (DSHDA). In the first stage of DSHDA, a Semantic Label Network (SeLabNet) is trained on the multi-label annotations to extract label semantic features and hash codes, so that the different modalities are learned in a common semantic space and the modality gap is bridged effectively. In the second stage, we propose a deep neural network that integrates feature learning and hash code learning for each modality into the same framework; its training is guided by the label semantic features and hash codes generated by SeLabNet to maximize cross-modal semantic relevance. Moreover, our neural networks use dual attention mechanisms: (1) Lo-attention extracts the local key information of each modality and improves the quality of the modality features; (2) Co-attention strengthens the relationship between different modalities to produce more consistent and accurate hash codes. Extensive experiments on two real datasets with image-text modalities demonstrate the superiority of the proposed method in cross-modal retrieval tasks.
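
The low storage cost and fast query speed attributed to hashing come from comparing short binary codes by Hamming distance. The following is a minimal sketch of that retrieval step only, not of the DSHDA networks themselves. The function names (`to_hash_codes`, `hamming_distance`, `retrieve`), the 4-bit code length, and the toy feature values are illustrative assumptions; in a real system the real-valued features would come from trained modality networks.

```python
import numpy as np

def to_hash_codes(features):
    """Binarize real-valued features into {-1, +1} codes via the sign function."""
    return np.where(np.asarray(features) >= 0, 1, -1).astype(np.int8)

def hamming_distance(query_code, db_codes):
    """Hamming distance between one code and every row of db_codes.

    For {-1, +1} codes of length k, distance = (k - inner_product) / 2.
    """
    k = db_codes.shape[1]
    inner = db_codes.astype(np.int32) @ query_code.astype(np.int32)
    return (k - inner) // 2

def retrieve(query_code, db_codes, top_k=3):
    """Return indices of the top_k database items closest to the query."""
    dists = hamming_distance(query_code, db_codes)
    return np.argsort(dists, kind="stable")[:top_k]

# Toy example (assumed values): an image query against three text items.
text_features = np.array([[0.9, -0.2, 0.4, -0.7],
                          [-0.5, 0.3, -0.8, 0.6],
                          [0.8, -0.1, -0.3, -0.9]])
image_query = np.array([0.7, -0.4, 0.2, -0.5])

text_codes = to_hash_codes(text_features)
query_code = to_hash_codes(image_query)
distances = hamming_distance(query_code, text_codes)
ranking = retrieve(query_code, text_codes)
```

Because both modalities are mapped into the same code space, an image query can rank text items directly. The accuracy of that ranking depends on how well the learned codes preserve semantic similarity across modalities, which is the part the proposed method addresses.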

Semantic-Guided Hashing for Cross-Modal Retrieval

Semantic Consistency Hashing for Cross-Modal Retrieval

Discrete Cross-Modal Hashing for Efficient Multimedia Retrieval

Efficient Discrete Supervised Hashing for Large-scale Cross-modal Retrieval

Latent semantic-enhanced discrete hashing for cross-modal retrieval

Semantic Decomposition and Enhancement Hashing for Deep Cross-Modal Retrieval

Deep Semantic Hashing with Dual Attention for Cross-Modal Retrieval

Semantic embedding based online cross-modal hashing method

Dual Semantic Fusion Hashing for Multi-Label Cross-Modal Retrieval

Semi-Supervised Semantic-Preserving Hashing For Efficient Cross-Modal Retrieval

Semantic Disentanglement Adversarial Hashing for Cross-Modal Retrieval

Two-Step Discrete Hashing for Cross-Modal Retrieval

Deep Semantic-Alignment Hashing for Unsupervised Cross-Modal Retrieval

Supervised Intra- and Inter-Modality Similarity Preserving Hashing for Cross-Modal Retrieval

An efficient dual semantic preserving hashing for cross-modal retrieval

Deep Cross-modal Hashing Based on Semantic Consistent Ranking

Deep Multi-Level Semantic Hashing for Cross-Modal Retrieval

Semantic Constraints Matrix Factorization Hashing for Cross-Modal Retrieval

Cross-modal Hashing with Semantic Deep Embedding

Pseudo-label driven deep hashing for unsupervised cross-modal retrieval

Fast Semantic Preserving Hashing for Large-Scale Cross-Modal Retrieval