Abstract: Owing to their high retrieval efficiency and low storage cost in cross-modal search tasks, cross-modal hashing methods have attracted considerable attention from researchers. For supervised cross-modal hashing methods, the key to further improving retrieval performance is making the learned hash codes sufficiently preserve the semantic information contained in the labels of datapoints. Hence, almost all supervised cross-modal hashing methods depend, fully or partly, on similarities between datapoints that are defined from the label information and used to guide the learning of the hashing model. However, a defined similarity between datapoints captures the label information only partially and misses abundant semantic information, which hinders further improvement of retrieval performance. Thus, in this paper, unlike previous works, we propose a novel cross-modal hashing method that does not define the similarity between datapoints, called Deep Cross-modal Proxy Hashing (DCPH). Specifically, DCPH first trains a proxy hashing network to transform the information of each category in a dataset into a semantically discriminative hash code, called a proxy hash code. Each proxy hash code preserves the semantic information of its corresponding category well. Next, instead of defining the similarity between datapoints to supervise the training of the modality-specific hashing networks, we propose a novel margin-dynamic-softmax loss that directly uses the proxy hash codes as supervision. Finally, by minimizing the margin-dynamic-softmax loss, the modality-specific hashing networks are trained to generate hash codes that simultaneously preserve the cross-modal similarity and abundant semantic information. Extensive experiments on three benchmark datasets show that the proposed method outperforms state-of-the-art baselines in cross-modal retrieval tasks.
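
The abstract does not give the form of the margin-dynamic-softmax loss, so the following is only a minimal sketch of the general idea of supervising hash outputs with per-category proxy codes rather than pairwise similarities. Everything in it is an assumption made for illustration: the function name `proxy_margin_softmax_loss`, the fixed `margin` and `scale` parameters, and the use of cosine similarity are hypothetical, and a generic fixed-margin softmax stands in for the paper's loss, which may differ substantially.

```python
import numpy as np

def proxy_margin_softmax_loss(codes, proxies, labels, margin=0.2, scale=8.0):
    """Generic margin softmax over similarities to proxy codes.

    Hypothetical stand-in, NOT the paper's margin-dynamic-softmax loss.
    codes:   (n, k) real-valued (relaxed) outputs of a modality-specific network
    proxies: (c, k) one proxy hash code per category, entries in {-1, +1}
    labels:  (n, c) multi-hot category labels
    """
    codes = np.asarray(codes, dtype=float)
    proxies = np.asarray(proxies, dtype=float)
    labels = np.asarray(labels, dtype=float)

    # Cosine similarity between every output and every proxy code.
    codes_n = codes / np.linalg.norm(codes, axis=1, keepdims=True)
    proxies_n = proxies / np.linalg.norm(proxies, axis=1, keepdims=True)
    sim = codes_n @ proxies_n.T  # shape (n, c)

    # Subtract a fixed margin from the similarities of the true categories,
    # so they must beat the other categories by at least that margin.
    logits = scale * (sim - margin * labels)

    # Numerically stable log-softmax over categories.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # Average negative log-probability over each sample's true categories.
    n_pos = np.maximum(labels.sum(axis=1), 1.0)
    per_sample = -(labels * log_prob).sum(axis=1) / n_pos
    return per_sample.mean()

def hamming_distance(a, b):
    """Hamming distance between two {-1, +1} hash codes of equal length."""
    return int(np.sum(np.sign(a) != np.sign(b)))
```

Under these assumptions, an image network and a text network trained against the same proxy codes are pulled toward the same binary targets, so their binarized outputs can be compared directly with `hamming_distance` at retrieval time without any pairwise similarity matrix.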

Cross-modal retrieval based on shared proxies

Semantic Consistency Hashing for Cross-Modal Retrieval

X-Gacmn: An X-Shaped Generative Adversarial Cross-Modal Network With Hypersphere Embedding

Frustratingly Easy Cross-Modal Hashing

Learning Disentangled Representation for Cross-Modal Retrieval with Deep Mutual Information Estimation.

Learning Explicit and Implicit Latent Common Spaces for Audio-Visual Cross-Modal Retrieval

Deep Neighborhood-aware Proxy Hashing with Uniform Distribution Constraint for Cross-modal Retrieval

Deep Cross-modal Proxy Hashing

Adversarial Cross-Modal Retrieval via Learning and Transferring Single-Modal Similarities

Deep Supervised Cross-Modal Retrieval

Adversarial Cross-Modal Retrieval

Joint Dictionary Learning and Semantic Constrained Latent Subspace Projection for Cross-Modal Retrieval.

Rethinking Label-Wise Cross-Modal Retrieval from A Semantic Sharing Perspective

Cross-Modal Coordination Across a Diverse Set of Input Modalities

Joint Latent Subspace Learning and Regression for Cross-Modal Retrieval

Graph Embedding Learning for Cross-Modal Information Retrieval.

Learning Discriminative Representations for Semantic Cross Media Retrieval

Adversarial Learning-Based Semantic Correlation Representation for Cross-Modal Retrieval

Deep Multi-Graph Hierarchical Enhanced Semantic Representation for Cross-Modal Retrieval

Discriminative Dictionary Learning with Common Label Alignment for Cross-Modal Retrieval.