Abstract: Hashing-based cross-modal retrieval methods have become increasingly popular because of their advantages in storage and speed. Although current methods achieve impressive results, several issues remain unaddressed. In particular, many approaches assume that labels are perfectly assigned, whereas in real-world scenarios labels are often incomplete or partially missing. There are two reasons for this: manual labeling is a complex and time-consuming task, and annotators may only be interested in certain objects. Cross-modal retrieval with missing labels is therefore a significant challenge that requires further attention. Moreover, the similarity between labels, which is important for exploring their high-level semantics, is frequently ignored. To address these limitations, we propose a novel method called Cross-Modal Hashing with Missing Labels (CMHML). Our method consists of several key components. First, we introduce Reliable Label Learning to preserve reliable information from the observed labels. Next, to infer the uncertain part of the predicted labels, we decompose the predicted labels into latent representations of labels and samples. The sample representation is extracted from the different modalities, which assists in inferring missing labels. We also propose Label Correlation Preservation to enhance the similarity between the latent representations of labels. Hash codes are then learned from the sample representation through Global Approximation Learning. In addition, we construct a similarity matrix from the predicted labels and embed it into hash code learning to exploit the label information. Finally, we train linear classifiers to map original samples to a low-dimensional Hamming space. To evaluate the efficacy of CMHML, we conduct extensive experiments on four publicly available datasets. We compare our method with other state-of-the-art methods, and the results demonstrate that our model performs competitively even when most labels are missing.
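The last two steps described in the abstract (building a similarity matrix from labels, then mapping samples into Hamming space with linear functions) can be illustrated with a minimal sketch. This is not the CMHML formulation: the abstract does not specify the similarity measure or the training objective, so cosine similarity and ridge regression are assumptions chosen for illustration, and the function names `label_similarity`, `fit_linear_hash`, and `hash_codes` are hypothetical.

```python
import numpy as np


def label_similarity(labels):
    """Cosine similarity between rows of an (n_samples, n_labels) matrix.

    Assumption: cosine similarity stands in for whatever measure the
    paper actually uses on its predicted labels.
    """
    labels = np.asarray(labels, dtype=float)
    norms = np.linalg.norm(labels, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # rows with no labels keep zero similarity
    unit = labels / norms
    return unit @ unit.T


def fit_linear_hash(features, codes, reg=1e-3):
    """Fit a linear map from features (n, d) to codes (n, k) in {-1, +1}.

    Assumption: plain ridge regression, solved in closed form.
    """
    features = np.asarray(features, dtype=float)
    codes = np.asarray(codes, dtype=float)
    gram = features.T @ features + reg * np.eye(features.shape[1])
    return np.linalg.solve(gram, features.T @ codes)


def hash_codes(features, weights):
    """Binarize the linear projections; non-negative values map to +1."""
    projections = np.asarray(features, dtype=float) @ weights
    return np.where(projections >= 0, 1, -1)


# Toy usage: three samples, two labels, two-bit codes.
toy_labels = [[1, 0], [1, 0], [0, 1]]
toy_features = np.eye(3)
toy_codes = np.array([[1, -1], [1, -1], [-1, 1]])

similarity = label_similarity(toy_labels)
weights = fit_linear_hash(toy_features, toy_codes)
recovered = hash_codes(toy_features, weights)
```

In this toy setting, samples that share a label get similarity 1 and the fitted linear map reproduces the target codes. A real cross-modal setup would fit one such map per modality against shared codes.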
