Abstract: Recently, locality-sensitive hashing (LSH) algorithms have attracted much attention owing to their empirical success and theoretical guarantees in large-scale visual search. In this paper we address the new topic of hashing with multi-label data, in which images in the database are assumed to be associated with multiple labels that may be missing or noisy, and each query consists of a query image and several textual search terms, similar to the "Search with Image" function introduced by Google Image Search. The returned images are judged on a combination of visual similarity and the semantic information conveyed by the search terms. In most state-of-the-art approaches, the learned hashing functions are universal for all labels. To further enhance hashing efficiency for such multi-label data, we propose a novel scheme, "boosted shared hashing". Our basic observation is that image labels typically form cliques in the feature space. Hashing efficacy can be greatly improved by making each hashing function more targeted at, and shared only across, such cliques, rather than across all labels as in conventional hashing methods. In other words, each hashing function is deliberately designed to be especially effective for a subset of labels. The targeted but sparse association between labels and hash bits reduces computation and storage when indexing a new datum, since only a small number of relevant hashing functions become active given its labels. We develop a Boosting-style algorithm that simultaneously optimizes the label subset and the hashing function in a unified framework. Experimental results on standard image benchmarks such as CIFAR-10 and NUS-WIDE show that the proposed hashing scheme achieves substantially higher accuracy than conventional methods under the same hash bit budget.
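The sparse association between labels and hash bits described in the abstract can be illustrated with a minimal sketch. This is not the paper's Boosting-style learning procedure: the projection matrix `W` is random rather than learned, the association matrix `assoc` is written by hand, and the names `hash_codes`, `active_bits`, and `masked_hamming` are hypothetical helpers introduced only for illustration. The sketch shows only how a query's labels could select a subset of active bits, so that Hamming distance is computed over those bits alone.

```python
import numpy as np

def hash_codes(X, W):
    """Sign-of-projection hash bits; each column of W is one hashing function."""
    return (X @ W > 0).astype(np.uint8)

def active_bits(assoc, labels):
    """Boolean mask of bits associated with at least one of the given labels.

    assoc is a (num_labels x num_bits) 0/1 matrix (hypothetical, hand-set here).
    """
    return assoc[labels].any(axis=0)

def masked_hamming(query_code, db_codes, mask):
    """Hamming distance to each database code, counted over active bits only."""
    return ((db_codes != query_code) & mask).sum(axis=1)

rng = np.random.default_rng(0)
dim, n_bits = 16, 8

# Assumption: random projections stand in for the learned hashing functions.
W = rng.standard_normal((dim, n_bits))

# Assumption: 3 labels, each tied to a small subset of the 8 bits.
assoc = np.array([[1, 1, 1, 0, 0, 0, 0, 0],
                  [0, 0, 1, 1, 1, 0, 0, 0],
                  [0, 0, 0, 0, 0, 1, 1, 1]], dtype=bool)

X_db = rng.standard_normal((5, dim))          # toy database features
db_codes = hash_codes(X_db, W)

query = X_db[2] + 0.01 * rng.standard_normal(dim)   # query image feature
q_code = hash_codes(query[None, :], W)[0]

mask = active_bits(assoc, [0, 1])             # query carries labels 0 and 1
dists = masked_hamming(q_code, db_codes, mask)
ranking = np.argsort(dists)                   # nearest database items first
```

In this toy setup, bit 2 is shared by labels 0 and 1, while bits 5 to 7 are ignored for the query, which mirrors the idea that only the hashing functions relevant to the given labels become active.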

Enhancing Multi-Label Deep Hashing for Image and Audio With Joint Internal Global Loss Constraints and Large Vision-Language Model

Nonlinear Discrete Cross-Modal Hashing for Visual-Textual Data

Scalable Multimedia Retrieval By Deep Learning Hashing With Relative Similarity Learning

Deep Multi-Label Hashing For Large-Scale Visual Search Based On Semantic Graph

Multiple Hierarchical Deep Hashing for Large Scale Image Retrieval

Joint Image-Text Hashing for Fast Large-Scale Cross-Media Retrieval Using Self-Supervised Deep Learning.

Deep Visual-Semantic Hashing for Cross-Modal Retrieval

Improve Deep Hashing with Language Guidance for Unsupervised Image Retrieval

TransHash: Transformer-based Hamming Hashing for Efficient Image Retrieval

Deep Multi-View Enhancement Hashing for Image Retrieval.

Deep Collaborative Multi-View Hashing for Large-Scale Image Search

Deep Hashing with Multi-Central Ranking Loss for Multi-Label Image Retrieval.

Compact hashing for mixed image-keyword query over multi-label images

Deep Multi-Level Semantic Hashing for Cross-Modal Retrieval

Specific class center guided deep hashing for cross-modal retrieval

Multi-scale Consistency Deep Lifelong Cross-modal Hashing

Deep Hashing: A Joint Approach for Image Signature Learning.

Rank-Consistency Deep Hashing for Scalable Multi-Label Image Search

Large-Scale Multi-Task Image Labeling with Adaptive Relevance Discovery and Feature Hashing

Joint Specifics and Consistency Hash Learning for Large-Scale Cross-Modal Retrieval

Unsupervised Deep Cross-modal Hashing with Virtual Label Regression