NSDH: A Nonlinear Supervised Discrete Hashing framework for large-scale cross-modal retrieval

Zhan Yang, Liu Yang, Osolo Ian Raymond, Lei Zhu, Wenti Huang, Zhifang Liao, Jun Long
DOI: https://doi.org/10.1016/j.knosys.2021.106818
2021-04-01
Abstract:<p>Hashing has been widely used in approximate nearest neighbor search for large-scale cross-modal retrieval because it greatly reduces storage cost and enables high-speed search. However, most existing supervised cross-modal hashing methods either rely mainly on binary pairwise similarity and fail to exploit the rich semantic information contained in the label matrix, or use a single linear projection, which limits the completeness of the information captured. In this paper, we propose a novel method named <strong>N</strong>onlinear <strong>S</strong>upervised <strong>D</strong>iscrete <strong>H</strong>ashing (<strong>NSDH</strong>). Specifically, NSDH consists of two components: (1) a semantic enhancement descriptor, consisting of multiple linear projections, that extracts comprehensive latent representations of heterogeneous multimedia data by aligning the original heterogeneous features and integrating the rich semantic label matrix; and (2) a fast discrete optimization module that learns discriminative, compact hash codes and preserves similarity information through an inner product between the real-valued embeddings output by the semantic enhancement descriptor. NSDH therefore leverages both the label matrix and the similarity information to enhance the semantic information of the learned hash codes. In this way, the representation learning capability of the output layer of the semantic enhancement descriptor is greatly enhanced and, as a result, the learned hash codes are more discriminative. In addition, we present a fast discrete optimization algorithm to learn the binary hash codes efficiently. Experiments on two benchmark datasets show that NSDH outperforms many state-of-the-art cross-modal hashing methods.</p>
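As a generic illustration of why binary hash codes allow fast retrieval, the sketch below binarizes real-valued embeddings with a sign function and ranks database items by Hamming distance to a query from another modality. It is not the NSDH optimization procedure: the function names and the toy embeddings are hypothetical, and the only fact it relies on is that, for codes in {-1, +1}^r, the inner product equals r minus twice the Hamming distance, so ranking by Hamming distance is equivalent to ranking by code inner product.

```python
from typing import List, Sequence


def sign_code(embedding: Sequence[float]) -> List[int]:
    """Binarize a real-valued embedding into a {-1, +1} code (0 maps to +1)."""
    return [1 if v >= 0 else -1 for v in embedding]


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of positions at which two equal-length codes differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def inner(a: Sequence[int], b: Sequence[int]) -> int:
    """Inner product of two equal-length codes."""
    return sum(x * y for x, y in zip(a, b))


def rank_by_hamming(query: Sequence[int], database: List[List[int]]) -> List[int]:
    """Database indices sorted by increasing Hamming distance to the query."""
    return sorted(range(len(database)), key=lambda i: hamming(query, database[i]))


# Toy data (made up for illustration): three "image" embeddings and one "text" query,
# assumed to already live in a shared latent space.
image_embeddings = [
    [0.9, -0.2, 0.4, -0.7],
    [-0.5, 0.3, -0.8, 0.6],
    [0.1, -0.9, 0.2, 0.5],
]
text_query_embedding = [0.7, -0.1, 0.3, -0.4]

image_codes = [sign_code(e) for e in image_embeddings]
query_code = sign_code(text_query_embedding)
ranking = rank_by_hamming(query_code, image_codes)
```

Because the codes are binary, the Hamming distance can in practice be computed with bitwise XOR and a population count, which is where the storage and speed advantages of hashing come from.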
computer science, artificial intelligence