Abstract:Deriving from the gradient vector of a generative model of local features, Fisher vector coding (FVC) has been identified as an effective coding method for image classification. Most, if not all, FVC implementations employ the Gaussian mixture model (GMM) as the generative model for local features. However, the representative power of a GMM can be limited because it essentially assumes that local features can be characterized by a fixed number of feature prototypes, and the number of prototypes is usually small in FVC. To alleviate this limitation, in this work, we break the convention which assumes that a local feature is drawn from one of a few Gaussian distributions. Instead, we adopt a compositional mechanism which assumes that a local feature is drawn from a Gaussian distribution whose mean vector is composed as a linear combination of multiple key components, and the combination weight is a latent random variable. In doing so we greatly enhance the representative power of the generative model underlying FVC. To implement our idea, we design two particular generative models following this compositional approach. In our first model, the mean vector is sampled from the subspace spanned by a set of bases and the combination weight is drawn from a Laplace distribution. In our second model, we further assume that a local feature is composed of a discriminative part and a residual part. As a result, a local feature is generated by the linear combination of discriminative part bases and residual part bases. The decomposition of the discriminative and residual parts is achieved via the guidance of a pre-trained supervised coding method. By calculating the gradient vector of the proposed models, we derive two new Fisher vector coding strategies. The first is termed Sparse Coding-based Fisher Vector Coding (SCFVC) and can be used as the substitute of traditional GMM based FVC. The second is termed Hybrid Sparse Coding-based Fisher vector coding (HSCFVC) since it combines the merits of both pre-trained supervised coding methods and FVC. Using pre-trained Convolutional Neural Network (CNN) activations as local features, we experimentally demonstrate that the proposed methods are superior to traditional GMM based FVC and achieve state-of-the-art performance in various image classification tasks.

Beyond Local Phase Quantization: Mid-Level Blurred Image Representation Using Fisher Vector

Taming Vector-Wise Quantization for Wide-Range Image Blending with Smooth Transition

Action Recognition with Stacked Fisher Vectors.

Automatic blur region segmentation approach using image matting

Spatial Weighted Fisher Vector for Image Retrieval

Encoding High Dimensional Local Features By Sparse Coding Based Fisher Vectors

Efficiently Exploiting Spatially Variant Knowledge for Video Deblurring

Robust Face Recognition in Low Resolution and Blurred Image Using Joint Information in Space and Frequency.

Compositional Model Based Fisher Vector Coding for Image Classification

An image quality assessment method based on edge extraction and singular value for blurriness

Fuzzy-Kernel Learning Vector Quantization

Refining local descriptors by embedding semantic information for visual categorization.

Fuzzy operator infrared image deblurring algorithm for image blurring in dragon boat races

Reference-Based Multi-Level Features Fusion Deblurring Network for Optical Remote Sensing Images

Color Multi-Fusion Fisher Vector Feature for Fine Art Painting Categorization and Influence Analysis.

Fusing local patterns of Gabor magnitude and phase for face recognition.

Image Deblurring based on Lightweight Multi-Information Fusion Network

Spherical Centralized Quantization for Fast Image Retrieval

Compact Representation of High-Dimensional Feature Vectors for Large-Scale Image Recognition and Retrieval

Selectively Aggregated Fisher Vectors of Query Video for Mobile Visual Search

Robust fisher codes for large scale image retrieval