Abstract: Multi-modal hashing encodes large-scale social geo-media data from multiple sources into a common discrete hash space, in which the heterogeneous correlations among modalities can be explored and preserved in semantically consistent hash codes. Current research on multi-modal hashing mainly focuses on common data reconstruction, but it fails to distill the intrinsic consensus structures of multi-modal data and to fully exploit the inherent semantic knowledge needed to capture semantically consistent information across modalities, which leads to unsatisfactory retrieval performance. To address this problem and develop an efficient multi-modal geographical retrieval method, we propose a discriminative multi-modal hashing framework named Cognitive Multi-modal Consistent Hashing (CMCH), which progressively pursues structure consensus over heterogeneous multi-modal data while simultaneously exploring informative transformed semantics. Specifically, we construct a parameter-free collaborative multi-modal fusion module to extract the underlying common components from multi-source data. In particular, our formulation seeks joint compatibility among multiple modalities through self-adaptive weighting, which takes full advantage of their complementary properties. Moreover, a cognitive self-paced learning policy conducts progressive feature aggregation, coalescing multi-modal data onto the established common latent space in a curriculum learning mode. Furthermore, deep semantic transform learning generates flexible semantics that interactively guide collaborative hash code learning. An efficient discrete learning algorithm is devised to solve the resulting optimization problem, and it obtains stable solutions on large-scale multi-modal retrieval tasks.
Extensive experiments on four large-scale multi-modal datasets demonstrate the encouraging performance of the proposed CMCH method compared with state-of-the-art methods in terms of multi-modal information retrieval and computational efficiency. The source code of this work is available at https://github.com/JunfengAn1998a/CMCH.
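
For readers unfamiliar with the general setting, the following is a minimal sketch of weighted multi-modal fusion into binary codes followed by Hamming-distance retrieval. It is not the CMCH algorithm: the random projections, the fixed modality weights, and the helper names `fuse_and_hash` and `hamming_distance` are all illustrative assumptions, whereas the abstract describes weights that are learned self-adaptively and codes obtained by discrete optimization.

```python
import numpy as np

def fuse_and_hash(features, weights, projections):
    """Weighted sum of per-modality linear projections, binarized to {-1, +1}.

    Assumption: fixed weights and random projections stand in for the
    learned quantities; this is a generic illustration, not CMCH itself.
    """
    fused = sum(w * (x @ p) for x, w, p in zip(features, weights, projections))
    return np.where(fused >= 0, 1, -1).astype(np.int8)

def hamming_distance(query_code, db_codes):
    """Hamming distance between one +/-1 code and every row of db_codes."""
    n_bits = db_codes.shape[1]
    # For +/-1 codes, distance = (n_bits - inner product) / 2.
    inner = db_codes.astype(np.int32) @ query_code.astype(np.int32)
    return (n_bits - inner) // 2

# Toy data: two hypothetical modalities (e.g. image and text features).
rng = np.random.default_rng(0)
n_items, n_bits = 6, 16
image_feats = rng.normal(size=(n_items, 32))
text_feats = rng.normal(size=(n_items, 20))
projections = [rng.normal(size=(32, n_bits)), rng.normal(size=(20, n_bits))]
weights = [0.6, 0.4]  # hypothetical fixed modality weights

codes = fuse_and_hash([image_feats, text_feats], weights, projections)
dists = hamming_distance(codes[0], codes)   # query with item 0
ranking = np.argsort(dists, kind="stable")  # nearest items first
```

Because all items share one discrete code space, retrieval reduces to integer comparisons on short codes, which is the source of the efficiency that hashing methods aim for.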

FedCAFE: Federated Cross-Modal Hashing with Adaptive Feature Enhancement

Discrete Cross-Modal Hashing for Efficient Multimedia Retrieval

Efficient Discrete Supervised Hashing for Large-scale Cross-modal Retrieval

Fast Discrete Collaborative Multi-Modal Hashing for Large-Scale Multimedia Retrieval

Cross-Domain Federated Data Modeling on Non-IID Data

Cross-Modal Hash Method Based on Multi-Scale Fusion and Projection Matching Constraint

Asymmetric Supervised Consistent and Specific Hashing for Cross-Modal Retrieval

Cross-modal retrieval based on multi-dimensional feature fusion hashing

Extensible Cross-Modal Hashing

Cognitive multi-modal consistent hashing with flexible semantic transformation

Privacy-Enhanced Prototype-based Federated Cross-modal Hashing for Cross-modal Retrieval

Adaptive Marginalized Semantic Hashing for Unpaired Cross-Modal Retrieval

Deep Adaptively-Enhanced Hashing With Discriminative Similarity Guidance for Unsupervised Cross-Modal Retrieval

MTFH: A Matrix Tri-Factorization Hashing Framework for Efficient Cross-Modal Retrieval

Deep Self-Supervised Hashing With Fine-Grained Similarity Mining for Cross-Modal Retrieval

Fast Cross-Modal Hashing With Global and Local Similarity Embedding

Large-Scale Cross-Modality Search via Collective Matrix Factorization Hashing

FedHAP: Federated Hashing with Global Prototypes for Cross-silo Retrieval

Deep Cross-Modal Hashing with Fine-Grained Similarity

Deep Medical Cross-Modal Attention Hashing

A High-Dimensional Sparse Hashing Framework for Cross-Modal Retrieval