Abstract: Hashing has shown great potential for cross-modal retrieval because of its retrieval efficiency and low storage cost. Most current supervised cross-modal retrieval methods rely heavily on accurate semantic supervision, which becomes intractable to annotate as sample sizes keep growing. By comparison, existing unsupervised methods compensate for the lack of semantic guidance with accurate but computationally intensive sample-similarity preservation strategies, and thereby lose the ability to bridge the semantic gap. Furthermore, both kinds of approaches must search for the nearest samples among all samples in a large search space, which is laborious. To address these issues, this paper proposes an unsupervised dual deep hashing (UDDH) method with semantic-index and content-code for cross-modal retrieval. Deep hashing networks extract deep features and jointly encode dual hashing codes, consisting of a common semantic index and modality content codes, in a collaborative manner to bridge the semantic and heterogeneous gaps simultaneously. The dual deep hashing architecture, comprising a head code for the semantic index and tail codes for the modality content, improves the efficiency of cross-modal retrieval. A query sample needs to search only among the samples that share its semantic index, which greatly shrinks the search space and yields superior retrieval efficiency. UDDH integrates deep feature extraction, binary optimization, common semantic index learning, and modality content code learning within a unified model, so that they can be optimized collaboratively to enhance overall performance. Extensive experiments demonstrate the retrieval superiority of the proposed approach over state-of-the-art baselines.
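The search-space reduction described in the abstract can be illustrated with a minimal sketch. It is not the UDDH implementation: the abstract does not specify code lengths, how the codes are learned, or the exact lookup procedure. The sketch simply assumes each sample already has a head code (semantic index) and a tail code (content code), stored as integers. All names here (`DualCodeIndex`, `add`, `search`, `hamming`) are hypothetical.

```python
from collections import defaultdict


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary codes stored as ints."""
    return bin(a ^ b).count("1")


class DualCodeIndex:
    """Toy index (hypothetical): bucket samples by head code, rank by tail code."""

    def __init__(self):
        # head code (semantic index) -> list of (item_id, tail code)
        self.buckets = defaultdict(list)

    def add(self, item_id, head, tail):
        self.buckets[head].append((item_id, tail))

    def search(self, head, tail, k=3):
        # Assumption: only the bucket sharing the query's head code is scanned,
        # so the cost depends on the bucket size, not on the database size.
        candidates = self.buckets.get(head, [])
        ranked = sorted(candidates, key=lambda item: hamming(item[1], tail))
        return [item_id for item_id, _ in ranked[:k]]


# Made-up codes purely for illustration.
index = DualCodeIndex()
index.add("img_cat_1", head=0b01, tail=0b1100)
index.add("img_cat_2", head=0b01, tail=0b0011)
index.add("img_dog_1", head=0b10, tail=0b1101)

# A text query whose (assumed) head code matches the "cat" bucket.
result = index.search(head=0b01, tail=0b1101, k=2)
print(result)
```

In this toy setup, `img_dog_1` is never compared against the query even though its tail code matches exactly, because its head code differs. This shows both the efficiency gain and the reliance on a correct semantic index that such a two-stage lookup implies.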

Unsupervised Deep Cross-modal Hashing with Virtual Label Regression

Efficient Discrete Supervised Hashing for Large-scale Cross-modal Retrieval

Discrete Cross-Modal Hashing for Efficient Multimedia Retrieval

Nonlinear Discrete Cross-Modal Hashing for Visual-Textual Data

Unsupervised Deep Hashing Via Binary Latent Factor Models for Large-scale Cross-modal Retrieval

Supervised Hierarchical Deep Hashing for Cross-Modal Retrieval

Unsupervised Multi-modal Hashing for Cross-Modal Retrieval

Deep Discrete Cross-Modal Hashing with Multiple Supervision

Deep Cross-Modal Hashing With Hashing Functions and Unified Hash Codes Jointly Learning

Pseudo-label driven deep hashing for unsupervised cross-modal retrieval

Dictionary Learning Based Supervised Discrete Hashing for Cross-Media Retrieval

Deep Cross-modal Hashing via Margin-dynamic-softmax Loss

Dual Variational Network for Unsupervised Cross-Modal Hashing

Unsupervised Deep Fusion Cross-modal Hashing

Discriminant Cross-modal Hashing

Learning Discriminative Binary Codes for Large-scale Cross-modal Retrieval

Unsupervised Dual Deep Hashing with Semantic-Index and Content-Code for Cross-Modal Retrieval

Deep Semantic Correlation Learning Based Hashing for Multimedia Cross-Modal Retrieval

Unsupervised Deep Imputed Hashing for Partial Cross-modal Retrieval

Efficient Online Label Consistent Hashing for Large-Scale Cross-Modal Retrieval

Deep Multi-Similarity Hashing Via Label-Guided Network for Cross-Modal Retrieval