Deep Saliency Hashing

Sheng Jin, Hongxun Yao, Xiaoshuai Sun, Shangchen Zhou, Lei Zhang, Xiansheng Hua

DOI: https://doi.org/10.48550/arXiv.1807.01459

2019-02-01

Abstract: In recent years, hashing methods have proven effective and efficient for large-scale Web media search. However, existing general hashing methods have limited discriminative power for describing fine-grained objects that share a similar overall appearance but differ in subtle ways. To solve this problem, we introduce, for the first time, the attention mechanism to the learning of fine-grained hashing codes. Specifically, we propose a novel deep hashing model, named deep saliency hashing (DSaH), which automatically mines salient regions and learns semantic-preserving hashing codes simultaneously. DSaH is a two-step end-to-end model consisting of an attention network and a hashing network. Our loss function contains three basic components: the semantic loss, the saliency loss, and the quantization loss. As the core of DSaH, the saliency loss guides the attention network to mine discriminative regions from pairs of images. We conduct extensive experiments on both fine-grained and general retrieval datasets for performance evaluation. Experimental results on fine-grained datasets, including Oxford Flowers-17, Stanford Dogs-120, and CUB Bird, demonstrate that DSaH performs best on the fine-grained retrieval task and beats the strongest competitor (DTQ) by approximately 10% on both Stanford Dogs-120 and CUB Bird. DSaH is also comparable to several state-of-the-art hashing methods on general datasets, including CIFAR-10 and NUS-WIDE.
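
To illustrate why hashing codes make large-scale search efficient, here is a minimal sketch of retrieval by Hamming distance. It is not taken from the paper: the function names, the sign-based binarization, and the toy 4-bit outputs are all assumptions made for illustration, and a real system would use much longer codes produced by the trained hashing network.

```python
def binarize(outputs):
    """Map real-valued outputs to a {-1, +1} code by taking the sign (assumed scheme)."""
    return [1 if v >= 0 else -1 for v in outputs]


def hamming_distance(code_a, code_b):
    """Count the positions where two equal-length codes differ."""
    return sum(1 for a, b in zip(code_a, code_b) if a != b)


def rank_by_hamming(query_code, database_codes):
    """Return database indices ordered from nearest to farthest in Hamming distance."""
    return sorted(
        range(len(database_codes)),
        key=lambda i: hamming_distance(query_code, database_codes[i]),
    )


# Hypothetical real-valued outputs for three database images and one query.
database_outputs = [
    [0.9, -0.8, 0.7, -0.6],
    [0.8, -0.9, -0.7, 0.5],
    [-0.7, 0.6, -0.9, 0.8],
]
database_codes = [binarize(o) for o in database_outputs]
query_code = binarize([0.6, -0.5, 0.9, -0.4])

ranking = rank_by_hamming(query_code, database_codes)
print(ranking)
```

Because the codes are binary, comparing two images reduces to counting differing bits, which is far cheaper than comparing high-dimensional real-valued features.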

Computer Vision and Pattern Recognition

What problem does this paper attempt to address?

This paper addresses the limited discriminative ability of existing general-purpose hashing methods when describing fine-grained objects in large-scale image retrieval. Fine-grained objects are objects that look similar overall but differ in subtle details. To overcome this challenge, the authors introduce the attention mechanism into the learning of fine-grained hash codes for the first time and propose a new model named Deep Saliency Hashing (DSaH). DSaH automatically mines salient regions while simultaneously learning semantic-preserving hash codes. Specifically, DSaH is a two-step end-to-end model consisting of an attention network and a hashing network. Its loss function contains three basic components: a semantic loss, a saliency loss, and a quantization loss. Of these, the saliency loss guides the attention network to mine discriminative regions from image pairs. In extensive experiments, DSaH outperforms the strongest existing competitor (DTQ) on fine-grained retrieval datasets and is comparable to several state-of-the-art hashing methods on general-purpose datasets.
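
The three-part loss described above can be sketched as a weighted sum. This is only an illustration under stated assumptions: the summary does not give the paper's formulas, so the quantization penalty below is a generic form often used in deep hashing, the weights `w_saliency` and `w_quantization` are hypothetical, and the semantic and saliency terms are passed in as already-computed numbers rather than implemented.

```python
def quantization_loss(outputs):
    """Mean squared gap between each output's magnitude and 1.

    Generic penalty (an assumption, not the paper's definition): it is zero
    when every output is exactly -1 or +1, so binarizing loses nothing.
    """
    return sum((abs(v) - 1.0) ** 2 for v in outputs) / len(outputs)


def total_loss(semantic, saliency, quantization,
               w_saliency=1.0, w_quantization=0.1):
    """Combine the three terms with hypothetical weights."""
    return semantic + w_saliency * saliency + w_quantization * quantization


# Outputs already at +/-1 incur no penalty; outputs at 0 incur the maximum here.
print(quantization_loss([1.0, -1.0, 1.0]))
print(quantization_loss([0.0, 0.0]))

# Placeholder values standing in for the semantic and saliency terms.
print(total_loss(semantic=1.0, saliency=2.0,
                 quantization=quantization_loss([0.5, -0.5])))
```

Keeping the terms separate makes it easy to see how each one pulls on training: the semantic term shapes code similarity, the saliency term steers the attention network, and the quantization term pushes outputs toward binary values.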

Deep Saliency Hashing for Fine-Grained Retrieval

Deep Attention-guided Hashing

Attention-based Saliency Hashing for Ophthalmic Image Retrieval

Image Retrieval via Balanced and Maximum Variance Deep Hashing

Deep Hashing Based on Class-Discriminated Neighborhood Embedding

Deep Progressive Hashing for Image Retrieval

Deep Self-Adaptive Hashing for Image Retrieval

SSDH: Semi-Supervised Deep Hashing for Large Scale Image Retrieval

Image Retrieval Using a Deep Attention-Based Hash

Deep Contrastive Self-Supervised Hashing for Remote Sensing Image Retrieval

DHA: Supervised Deep Learning to Hash with an Adaptive Loss Function

Deep Unsupervised Hashing with Selective Semantic Mining

Deep Spatial Attention Hashing Network for Image Retrieval

Deep Attention Sampling Hashing for Efficient Image Retrieval

Deep Semantic-Preserving and Ranking-Based Hashing for Image Retrieval

Deep Multi-level Hashing Codes for Image Retrieval

Deep Hashing Network with Hybrid Attention and Adaptive Weighting for Image Retrieval

Deep Self-Learning Hashing for Image Retrieval

Robust Deep Supervised Hashing for Image Retrieval

Deep Multi-Label Hashing For Large-Scale Visual Search Based On Semantic Graph