Abstract: Cross-modal hashing (CMH) retrieval searches across heterogeneous modalities by projecting the original data of each modality into a common Hamming space, which offers low storage and computation costs. However, CMH remains challenging for multi-label cross-modal datasets. First, short binary codes are inevitably limited in how much content similarity they can preserve. Second, different semantics are treated independently and their co-occurrences are neglected, which reduces retrieval quality. Third, the commonly used metric learning objective fails to capture similarity information at a fine-grained level, so that information is preserved imprecisely. We therefore propose a Deep Cross-Modal Hashing with Multi-Task Latent Space Learning (DMLSH) framework to tackle these bottlenecks. To mine the distinctive features underlying heterogeneous data more thoroughly, DMLSH is designed to preserve three types of knowledge: first, semantic relevance and co-occurrence, captured by integrating an attention module with a Long Short-Term Memory (LSTM) layer; second, highly precise pairwise correlation, which quantifies semantic similarity and is learned with self-paced optimization; and third, pairwise similarity information discovered by a self-supervised semantic network from the perspective of probabilistic knowledge transfer. The knowledge from these latent spaces is refined and fused into a common Hamming space by a hashing attention mechanism, which makes the hash codes more discriminative and helps eliminate the heterogeneity between modalities. Extensive experiments demonstrate the state-of-the-art performance of the proposed DMLSH on four mainstream cross-modal retrieval benchmarks.
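The retrieval setting the abstract describes, where items from different modalities are compared as binary codes in a shared Hamming space, can be illustrated with a minimal sketch. This is not the DMLSH method: the sign-based binarization, the 4-bit code length, the feature values, and the function names `sign_binarize`, `hamming_distance`, and `hamming_rank` are all assumptions made for illustration, standing in for the outputs of trained image and text encoders.

```python
from typing import List, Sequence


def sign_binarize(features: Sequence[float]) -> int:
    """Pack the signs of a real-valued vector into an integer bit code.

    Assumption: a non-negative entry maps to bit 1, a negative entry to bit 0.
    """
    code = 0
    for value in features:
        code = (code << 1) | (1 if value >= 0 else 0)
    return code


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two codes differ."""
    return bin(a ^ b).count("1")


def hamming_rank(query: int, database: Sequence[int]) -> List[int]:
    """Return database indices sorted by Hamming distance to the query."""
    return sorted(
        range(len(database)),
        key=lambda i: hamming_distance(query, database[i]),
    )


# Hypothetical real-valued outputs of an image encoder (the database)
# and a text encoder (the query), both already in the shared space.
image_features = [
    [0.9, -0.2, 0.4, -0.7],
    [-0.5, 0.3, -0.1, 0.8],
    [0.6, 0.1, 0.2, -0.9],
]
text_feature = [0.7, -0.1, 0.5, -0.3]

image_codes = [sign_binarize(v) for v in image_features]
text_query = sign_binarize(text_feature)

# Text-to-image retrieval: rank images by closeness to the text code.
ranking = hamming_rank(text_query, image_codes)
print(ranking)
```

Because each item is stored as a few bits and compared with an XOR followed by a bit count, storage and comparison stay cheap, which is the cost advantage the abstract refers to. It also shows the limitation the abstract raises: with short codes, many distinct items collapse onto the same few distance values.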

Collaborative Subspace Graph Hashing for Cross-modal Retrieval

Discrete Similarity Preserving Hashing for Cross-modal Retrieval

Semantic Consistency Hashing for Cross-Modal Retrieval

Efficient Discrete Supervised Hashing for Large-scale Cross-modal Retrieval

Unsupervised Concatenation Hashing Via Combining Subspace Learning and Graph Embedding for Cross-Modal Image Retrieval

Joint Coupled-Hashing Representation for Cross-Modal Retrieval

Graph Convolutional Multi-Label Hashing for Cross-Modal Retrieval

Supervised Intra- and Inter-Modality Similarity Preserving Hashing for Cross-Modal Retrieval

SCH: Symmetric Consistent Hashing for Cross-Modal Retrieval

Aggregation-Based Graph Convolutional Hashing for Unsupervised Cross-Modal Retrieval

Completely Unpaired Cross-Modal Hashing Based on Coupled Subspace

Asymmetric Supervised Consistent and Specific Hashing for Cross-Modal Retrieval

Deep Cross-Modal Hashing with Multi-Task Latent Space Learning

A High-Dimensional Sparse Hashing Framework for Cross-Modal Retrieval

From Sparse to Dense: Semantic Graph Evolutionary Hashing for Unsupervised Cross-Modal Retrieval

Online supervised collective matrix factorization hashing for cross-modal retrieval

Unsupervised Multi-modal Hashing for Cross-Modal Retrieval

Hierarchical Consensus Hashing for Cross-Modal Retrieval

Collective Reconstructive Embeddings for Cross-Modal Hashing

Semi-Supervised Graph Convolutional Hashing Network for Large-Scale Cross-Modal Retrieval

Large-Scale Cross-Modality Search via Collective Matrix Factorization Hashing