Abstract: Cross-modal hashing maps heterogeneous multimedia data into a common Hamming space through hash functions, enabling fast and flexible cross-modal retrieval. Most existing cross-modal hashing methods learn hash functions by mining the correlations among multimedia data, but they ignore an important property of such data: each modality has features of different scales, such as texture, object, and scene features in an image, which provide complementary information for the retrieval task. The correlations among multi-scale features are richer than those between single features; they reveal finer underlying structure in the multimedia data and can be used for effective hash function learning. Therefore, we propose the Multi-scale Correlation Sequential Cross-modal Hashing (MCSCH) approach, whose main contributions can be summarized as follows: (1) A multi-scale feature guided sequential hashing learning method is proposed to share information among features of different scales through an RNN-based network and generate the hash codes sequentially. Features of different scales guide the hash code generation, which enhances the diversity of the hash codes and weakens the influence of errors in specific features, such as false object features caused by occlusion. (2) A multi-scale correlation mining strategy is proposed to align the features of different scales across modalities and mine the correlations among the aligned features. These correlations reveal the finer underlying structure of multimedia data and help boost hash function learning. (3) A correlation evaluation network evaluates the importance of the correlations to select the worthwhile ones, and increases their impact on hash function learning.
Experiments on two widely-used 2-media datasets and a 5-media dataset demonstrate the effectiveness of our proposed MCSCH approach.
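
For readers unfamiliar with retrieval in a common Hamming space, the following minimal sketch illustrates only the final lookup step: a query code from one modality is compared with database codes from another modality by Hamming distance. It is not the MCSCH method. The binary codes are hand-written placeholders standing in for the output of learned hash functions, and the names `hamming` and `retrieve` are hypothetical.

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hash codes stored as integers."""
    return bin(a ^ b).count("1")


def retrieve(query_code: int, database_codes: list[int], top_k: int = 3) -> list[int]:
    """Return indices of the top_k database codes closest to the query."""
    ranked = sorted(
        range(len(database_codes)),
        key=lambda i: hamming(query_code, database_codes[i]),
    )
    return ranked[:top_k]


# Assumption: 8-bit codes that learned image and text hash functions
# would have produced; the values here are made up for illustration.
image_codes = [0b10110010, 0b01001101, 0b10110110, 0b11110000]
text_query_code = 0b10110011

# Text-to-image retrieval: rank the images by Hamming distance to the text query.
nearest_images = retrieve(text_query_code, image_codes, top_k=2)
print(nearest_images)
```

Because both modalities share one code space, the same XOR-and-count comparison serves image-to-text and text-to-image queries alike, which is what makes hashing-based retrieval fast.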

Label Guided Correlation Hashing for Large-Scale Cross-Modal Retrieval

Discrete Cross-Modal Hashing for Efficient Multimedia Retrieval

Efficient Discrete Supervised Hashing for Large-scale Cross-modal Retrieval

Graph Convolutional Multi-Label Hashing for Cross-Modal Retrieval

Deep Class-guided Hashing for Multi-label Cross-modal Retrieval

Label consistent locally linear embedding based cross-modal hashing

Supervised Hierarchical Deep Hashing for Cross-Modal Retrieval

Efficient Online Label Consistent Hashing for Large-Scale Cross-Modal Retrieval

Adaptive Label Correlation Based Asymmetric Discrete Hashing for Cross-modal Retrieval

Label Distribution Guided Hashing for Cross-Modal Retrieval

Multi-Level Correlation Adversarial Hashing for Cross-Modal Retrieval

Label Guided Discrete Hashing for Cross-Modal Retrieval

Adversary Guided Asymmetric Hashing for Cross-Modal Retrieval

Deep Semantic Correlation Learning Based Hashing for Multimedia Cross-Modal Retrieval

LCEMH: Label Correlation Enhanced Multi-modal Hashing for Efficient Multi-modal Retrieval

Semi-Supervised Graph Convolutional Hashing Network for Large-Scale Cross-Modal Retrieval

Sequential Cross-Modal Hashing Learning Via Multi-scale Correlation Mining

Correlation Hashing Network for Efficient Cross-Modal Retrieval

Supervised Intra- and Inter-Modality Similarity Preserving Hashing for Cross-Modal Retrieval

Learning Discriminative Binary Codes for Large-scale Cross-modal Retrieval

Joint Specifics and Consistency Hash Learning for Large-Scale Cross-Modal Retrieval