Abstract: The Bag-of-visual-Words (BoW) image representation has been shown to be one of the most promising solutions for large-scale near-duplicated image retrieval. However, the traditional visual vocabulary is created in an unsupervised way by clustering a large number of local image features. This is not ideal because it largely ignores the semantic and spatial contexts between local features. In this paper, we first propose the geometric visual vocabulary, which captures the spatial context by quantizing local features in bi-space, i.e., in descriptor space and orientation space. Then, we propose to capture the semantic context by learning a semantic-aware distance metric between local features, which measures the semantic similarity between the image patches from which the local features are extracted. The learned distance is then used to cluster the local features to generate the semantic visual vocabulary. Finally, we combine the spatial and semantic contexts in a unified framework by extracting local feature groups, computing the spatial configurations between the local features inside each group, and learning a semantic-aware distance between groups. The learned group distance is then used to cluster the extracted local feature groups to generate a novel visual vocabulary, i.e., the contextual visual vocabulary. The proposed visual vocabularies, i.e., the geometric visual vocabulary, the semantic visual vocabulary, and the contextual visual vocabulary, are tested in large-scale near-duplicated image retrieval applications. The geometric visual vocabulary and the semantic visual vocabulary achieve better performance than the traditional visual vocabulary. Moreover, the contextual visual vocabulary, which combines both spatial and semantic cues, outperforms the state-of-the-art bundled feature in both retrieval precision and efficiency.
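
The bi-space quantization mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that descriptor centers have already been obtained (e.g., by k-means), that orientation is quantized into uniform angular bins, and that the joint word index is formed as `descriptor_word * n_orient_bins + orientation_bin`. The function names and the parameter `n_orient_bins` are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

def nearest_center(descriptor, centers):
    """Index of the center closest to `descriptor` (squared Euclidean distance)."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(centers)), key=lambda k: sqdist(descriptor, centers[k]))

def orientation_bin(theta, n_bins):
    """Map an angle in radians to one of `n_bins` uniform bins over [0, 2*pi)."""
    two_pi = 2 * math.pi
    frac = (theta % two_pi) / two_pi
    # Clamp to guard against a floating-point result of exactly 1.0.
    return min(int(frac * n_bins), n_bins - 1)

def quantize_bispace(descriptors, orientations, centers, n_orient_bins=4):
    """Assumed joint indexing: one word per (descriptor word, orientation bin) pair."""
    return [
        nearest_center(d, centers) * n_orient_bins + orientation_bin(t, n_orient_bins)
        for d, t in zip(descriptors, orientations)
    ]

def bow_histogram(word_ids, vocab_size):
    """Count how often each visual word occurs in one image."""
    counts = Counter(word_ids)
    return [counts.get(w, 0) for w in range(vocab_size)]

# Toy data (made up for illustration): 2 descriptor centers, 4 orientation bins.
centers = [(0.0, 0.0), (10.0, 10.0)]
descriptors = [(1.0, 0.0), (9.0, 10.0), (0.0, 1.0)]
orientations = [0.1, math.pi + 0.1, 1.5 * math.pi + 0.1]

words = quantize_bispace(descriptors, orientations, centers, n_orient_bins=4)
hist = bow_histogram(words, vocab_size=len(centers) * 4)
```

In this toy example the first and third features fall into the same descriptor word but receive different joint words because their orientations differ, which is the kind of spatial distinction a descriptor-only vocabulary would miss.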

Database Saliency for Fast Image Retrieval

Visual Search Based on Contour Salient

Fast Object Retrieval Using Direct Spatial Matching

Saliency oriented object image re-ranking

Saliency Map-based Image Retrieval using Invariant Krawtchouk Moments

Partial-Duplicate Image Retrieval via Saliency-Guided Visual Matching

Multi-Scale Visual Words For Object-Based Web Image Search

Color Image Retrieval Utilizing Extended Fast Vq Codeword Search Technique And Vector Composition-Based Feedback

Spatially aware feature selection and weighting for object retrieval

Allocating images and selecting image collections for distributed visual search

Saliency-aware Semantic Image Coding for Mobile Visual Search

Query Expansion by Spatial Co-Occurrence for Image Retrieval

Fine-Grained Image Search

Matching Content-based Saliency Regions for partial-duplicate image retrieval

Object-Based Image Retrieval With Attention Analysis And Spatial Re-Ranking

Exploiting visual word co-occurrence for image retrieval

Modeling spatial and semantic cues for large-scale near-duplicated image retrieval

Learning to Combine Ad-hoc Ranking Functions for Image Retrieval

Relevance of useful visual words in object retrieval

Scalable Image Search with Reliable Binary Code

An Improved Remote Sensing Image Retrieval Method Based on Bag of Word Framework