Abstract: The MPEG compact descriptors for visual search (CDVS) is a standard for image matching and retrieval. Recent research has shown that extremely high-dimensional descriptors such as the Fisher vector (FV) and the vector of locally aggregated descriptors (VLAD) yield high retrieval accuracy over large-scale image/video datasets. Because the FV (or VLAD) offers high discriminability with only a small visual vocabulary, CDVS has adopted it to construct a global compact descriptor. In this paper, we study the development of global compact descriptors in the completed CDVS standard and the emerging compact descriptors for video analysis (CDVA) standard, and we formulate FV (or VLAD) compression as a resource-constrained optimization problem. Accordingly, we propose a codebook-free aggregation method via dual selection to generate a global compact visual descriptor. The descriptor supports fast and accurate feature matching without large visual codebooks, meeting the low memory requirement of mobile visual search at significantly reduced latency. Specifically, we investigate both sample-specific Gaussian component redundancy and bit dependency within a binary aggregated descriptor to produce compact binary codes. Our technique contributes to the scalable compressed Fisher vector (SCFV) adopted by the CDVS standard. Moreover, the SCFV descriptor currently serves as the frame-level hand-crafted video feature, which motivates the inheritance of CDVS descriptors by the emerging CDVA standard. Furthermore, we investigate the complementary effect of combining our standard-compliant compact descriptor with deep learning based features extracted from convolutional neural networks, which yields significant mean average precision gains. Extensive evaluation over benchmark databases shows the significant merits of the codebook-free binary codes for scalable visual search.
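
To make the idea of a binary aggregated descriptor with per-sample Gaussian component selection more concrete, the following is a minimal sketch, not the SCFV procedure as standardized. It assumes a diagonal-covariance GMM, uses only the gradient with respect to the component means, ranks components by gradient energy (an assumed selection criterion), and binarizes by sign. The function names `binary_fisher_sketch` and `masked_hamming`, and the `keep` parameter, are hypothetical.

```python
import numpy as np

def binary_fisher_sketch(X, weights, means, sigmas, keep):
    """Aggregate local descriptors X (N, D) into sign bits per GMM component.

    weights: (K,), means: (K, D), sigmas: (K, D) standard deviations.
    keep: number of components to retain for this sample (assumed rule).
    Returns (bits, selected): bits is (K, D) bool, selected is (K,) bool.
    """
    n = X.shape[0]
    # Whitened residuals of every descriptor against every component: (N, K, D)
    diff = (X[:, None, :] - means[None, :, :]) / sigmas[None, :, :]
    # Log of weighted component likelihoods; the constant 2*pi term is
    # dropped because it cancels when normalizing the posteriors.
    log_p = (-0.5 * np.sum(diff ** 2, axis=2)
             - np.sum(np.log(sigmas), axis=1)[None, :]
             + np.log(weights)[None, :])
    log_p -= log_p.max(axis=1, keepdims=True)   # numerical stability
    gamma = np.exp(log_p)
    gamma /= gamma.sum(axis=1, keepdims=True)   # posteriors, shape (N, K)
    # Fisher-vector gradient with respect to the means: (K, D)
    grad = np.einsum('nk,nkd->kd', gamma, diff)
    grad /= n * np.sqrt(weights)[:, None]
    # Assumed selection rule: keep the components with the largest energy.
    energy = np.sum(grad ** 2, axis=1)
    selected = np.zeros(len(weights), dtype=bool)
    selected[np.argsort(energy)[-keep:]] = True
    # Sign binarization: one bit per gradient dimension.
    bits = grad > 0
    return bits, selected

def masked_hamming(bits_a, sel_a, bits_b, sel_b):
    """Normalized Hamming distance over components selected in both samples."""
    common = sel_a & sel_b
    if not common.any():
        return 1.0  # no shared components: treat as maximally distant
    return float(np.mean(bits_a[common] != bits_b[common]))
```

Matching in this sketch needs only XOR-style bit comparisons on the shared components, which illustrates why such binary codes suit memory-constrained mobile search; the actual selection and weighting rules in the standard may differ.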

Fast And Compact Visual Codebook For Large-Scale Object Retrieval

A Fast Algorithm For Creating A Compact And Discriminative Visual Codebook

Visual Codebook Construction for Class-Specific Recognition

Compact Codebook Generation Towards Scale-Invariance

Learning Optimal Compact Codebook for Efficient Object Categorization

Task-dependent visual-codebook compression

Codebook-Free Compact Descriptor for Scalable Visual Search

Fast Codebook Design Method for Image Vector Quantisation

Large Visual Words For Large Scale Image Classification

Incremental Codebook Adaptation for Visual Representation and Categorization

Learning Multiple Codebooks for Low Bit Rate Mobile Visual Search

A novel visual codebook model based on fuzzy geometry for large-scale image classification

Towards Codebook-Free: Scalable Cascaded Hashing for Mobile Image Search

Category Sensitive Codebook Construction for Object Category Recognition

Beyond Explicit Codebook Generation: Visual Representation Using Implicitly Transferred Codebooks

Codebook Enhancement of VLAD Representation for Visual Recognition

Fast Object Retrieval Using Direct Spatial Matching

Image Representation Based on Multiple Visual Codebooks

Building Descriptive and Discriminative Visual Codebook for Large-Scale Image Applications

Unifying Discriminative Visual Codebook Generation with Classifier Training for Object Category Recognition

Discriminative Spatial Codebook Generation for Image Classification