Abstract: The bag-of-words (BoW) model, inspired by the problem of text representation and classification, has attracted intensive attention in object and scene categorization for its flexibility and good performance. In the BoW model, a visual vocabulary is obtained by clustering local patches detected in the training image set, and an image is then represented by a histogram of visual words. However, constructing an effective visual vocabulary remains a crucial and challenging step in the BoW model. Conventional methods for constructing a visual vocabulary are very time-consuming, and the resulting vocabulary is not discriminative enough. We propose a novel approach to constructing a visual vocabulary: the Combined Category Visual Vocabulary (CCVV). First, a category visual vocabulary is obtained for each image category. Then all of these category visual vocabularies are combined to form a general visual vocabulary, which can be used to represent images. Each visual word in CCVV is related to one image category, so it has a higher discriminative ability to separate that category from the others. The proposed approach also decreases the computational complexity, because each clustering run operates on local patches from only one category instead of all categories. Local patches of interest are detected with the Harris-Affine detector and described with the scale-invariant feature transform (SIFT) descriptor. A support vector machine (SVM) is used to train the classifier in our experiments. The proposed approach is evaluated on the VOC 2006 database, and the experimental results demonstrate that it is more computationally efficient and achieves better performance than conventional approaches.
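The two-stage construction described in the abstract (cluster each category separately, then concatenate the per-category vocabularies) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract says only "clustering", so plain k-means is an assumption here, and the function names `kmeans`, `build_ccvv`, and `bow_histogram` are hypothetical. Descriptor extraction (Harris-Affine plus SIFT) and SVM training are omitted; descriptors are assumed to be given as lists of floats.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means (an assumed choice of clustering); returns k centroids."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            clusters[nearest].append(p)
        for j, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return centroids


def build_ccvv(descriptors_by_category, words_per_category):
    """Cluster each category on its own, then concatenate the vocabularies."""
    vocabulary, word_category = [], []
    for category, descriptors in sorted(descriptors_by_category.items()):
        for centroid in kmeans(descriptors, words_per_category):
            vocabulary.append(centroid)
            word_category.append(category)  # remember which category each word came from
    return vocabulary, word_category


def bow_histogram(image_descriptors, vocabulary):
    """Normalized histogram of nearest visual words for one image."""
    counts = [0] * len(vocabulary)
    for d in image_descriptors:
        nearest = min(range(len(vocabulary)), key=lambda j: sq_dist(d, vocabulary[j]))
        counts[nearest] += 1
    total = sum(counts) or 1  # avoid division by zero for an image with no descriptors
    return [c / total for c in counts]
```

In this sketch each k-means run sees only one category's descriptors, which is where the reduced clustering cost claimed in the abstract would come from. The resulting histograms would then be passed to a classifier such as an SVM.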

Visual saliency coding for image categorization

Efficient Classification Using Salient Regions

Color boosted visual saliency detection and its application to image classification

A PCA Based Automatic Image Categorization Approach Using Dominant Color Features

Salient Coding for Image Classification

Integrating ILSR to Bag-of-Visual Words Model Based on Sparse Codes of SIFT Features Representations

Refining local descriptors by embedding semantic information for visual categorization

Image Categorization Based On Spatial Visual Vocabulary Model

A Saliency-based Weakly-supervised Network for Fine-Grained Image Categorization

Combined Segmentation And Visual Attention For Object Categorization And Video Semantic Concepts Detection

Context-Aware and Locality-Constrained Coding for Image Categorization

Incorporating Contextual Information into Bag-of-Visual-Words Framework for Effective Object Categorization

Visual Saliency Detection Based on Region Descriptors and Prior Knowledge

An Approach for Image Retrieval Based on Visual Saliency

Measuring Conceptual Relation of Visual Words for Visual Categorization

Image Retrieval with Saliency Object Weighted and Bag of Visual Pair

Optimal operations for visual categorization

Visual Attention Based Bag-of-words Model for Image Classification

Image Classification Based on Saliency Coding with Category-Specific Codebooks

Combined Category Visual Vocabulary: A new approach to visual vocabulary construction

One step beyond bags of features: Visual categorization using components