Multi-Facet Weighted Asymmetric Multi-Modal Hashing Based on Latent Semantic Distribution

Xu Lu, Li Liu, Lixin Ning, Liang Zhang, Shaomin Mu, Huaxiang Zhang
DOI: https://doi.org/10.1109/tmm.2024.3363664
IF: 7.3
2024-01-01
IEEE Transactions on Multimedia
Abstract: With the advent of multi-modal data, multi-modal hashing has received increasing attention because it can fuse complementary multi-modal information and support fast multimedia retrieval. Nevertheless, the “coarse-grained” modality weighting strategy widely used in existing methods ignores the distinctive contributions of different features and is burdened by parameter tuning. Besides, traditional supervised methods usually adopt “hard semantics” that reflect the logical relationship between data and labels but fail to capture the degree to which each category describes the data. To solve these problems, we propose a multi-Facet weIghting aSymmetric Multi-modal Hashing based on latent semantic distribution (FISMH) approach, which comes in a supervised paradigm, SFISMH, and an unsupervised paradigm, UFISMH. First, we design a Multi-facet Weighted Multi-modal Fusion module that utilizes both modality-wise and feature-wise weights to achieve multi-modal fusion, where the weight learning requires no additional parameter adjustment. Then, we design a Latent Semantic Distribution based Asymmetric Hash Learning module, which utilizes the pair-wise similarity and the semantic distribution to guide hash learning, and avoids the challenging pair-wise factorization through an asymmetric form. The semantic distribution is learned from the inherent information of the feature space, which can further preserve the intra-class relationships. Finally, a discrete hash optimization is developed to reduce quantization error and directly learn hash codes. The main difference between SFISMH and UFISMH is that the former utilizes category information while the latter explores the underlying data structure when constructing the pair-wise similarity. Extensive experiments demonstrate that both SFISMH and UFISMH outperform existing supervised and unsupervised multi-modal hashing methods.
computer science, information systems, telecommunications, software engineering
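
The asymmetric form mentioned in the abstract can be illustrated with a generic pair-wise similarity-preserving loss. The sketch below is not the FISMH objective; it is a minimal, assumed example of the general idea, in which one side of the inner product uses binary codes B and the other uses real-valued embeddings Z, so that the code-to-code product B^T B never has to be factorized directly. The function names, the toy similarity matrix S, and the scaling by the code length r are all assumptions made for illustration.

```python
# Illustrative sketch only: a generic asymmetric pair-wise similarity loss,
# not the objective function from the paper.

def sign(x):
    """Binarize a real value to +1 or -1 (0 maps to +1)."""
    return 1 if x >= 0 else -1

def asymmetric_loss(S, B, Z):
    """Sum over pairs (i, j) of (r * S[i][j] - <B[i], Z[j]>)^2.

    S: n x n similarity matrix with entries in [-1, 1] (assumed).
    B: n binary codes of length r, entries in {-1, +1}.
    Z: n real-valued embeddings of length r (the relaxed side).
    """
    r = len(B[0])
    total = 0.0
    for i, b in enumerate(B):
        for j, z in enumerate(Z):
            inner = sum(bk * zk for bk, zk in zip(b, z))
            total += (r * S[i][j] - inner) ** 2
    return total

# Toy data: items 0 and 1 share a class, item 2 belongs to another class.
S = [[1, 1, -1],
     [1, 1, -1],
     [-1, -1, 1]]
Z = [[0.9, 0.8], [0.7, 1.1], [-1.0, -0.6]]
B = [[sign(v) for v in z] for z in Z]  # binarize the embeddings
loss = asymmetric_loss(S, B, Z)
```

In this toy setting the loss is zero when Z equals B, and it grows as the real-valued embeddings drift away from codes that reproduce the similarity matrix.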