Abstract: In recent years, the bag of visual words (BoV) model has become very popular in computer vision, where it is widely used for image classification. The earliest BoV-based methods build a single visual dictionary by clustering the local features of the whole dataset. This approach ignores the differences between the categories of the image dataset, which leads to poor classification performance. A later, improved method builds one visual dictionary per category and then concatenates these dictionaries to produce the final visual dictionary. However, when the dataset contains many categories, this method is time consuming. In addition, building a visual dictionary for each category makes the visual words too dense, so they lose their ability to discriminate and to express information, which decreases classification performance. In this paper, we propose a new image classification method based on multiple visual dictionaries. The method addresses the above problems effectively and significantly improves classification performance. To build the visual dictionary, the proposed method first randomly divides the image dataset, which contains many categories, into N items and clusters the local features of the images in each of the N items separately. It then generates one visual dictionary for each item. Finally, it concatenates these visual dictionaries to produce the final visual dictionary of the entire image dataset. We conducted experiments on the Caltech-101 and Scene-67 image libraries and compared the results with those of traditional methods. Our experimental results show that the proposed method can match the best results published in the previous literature in terms of classification accuracy.
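The dictionary construction described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes that the random division into N items is a split over categories (the abstract does not say whether categories or images are split), it uses a plain Lloyd's k-means as the clustering method, and it stands in random 2-D vectors for real local features. All names (`kmeans`, `build_dictionary`, `encode`, `n_groups`, `words_per_group`) are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=20, rng=None):
    """Plain Lloyd's k-means; returns k centroids (the visual words)."""
    rng = rng or random.Random(0)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            clusters[nearest].append(p)
        for j, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def build_dictionary(features_by_category, n_groups, words_per_group, seed=0):
    """Build one dictionary per random group and concatenate them.

    features_by_category maps a category name to a list of local feature
    vectors. Splitting by category is an assumption made for this sketch.
    """
    rng = random.Random(seed)
    categories = sorted(features_by_category)
    rng.shuffle(categories)
    # Randomly divide the categories into n_groups roughly equal groups.
    groups = [categories[i::n_groups] for i in range(n_groups)]
    dictionary = []
    for group in groups:
        feats = [f for c in group for f in features_by_category[c]]
        dictionary.extend(kmeans(feats, words_per_group, rng=rng))
    return dictionary


def encode(image_features, dictionary):
    """Histogram of nearest visual words for one image."""
    hist = [0] * len(dictionary)
    for f in image_features:
        j = min(range(len(dictionary)), key=lambda i: sq_dist(f, dictionary[i]))
        hist[j] += 1
    return hist


# Toy data: 4 categories, 10 random 2-D "local features" each.
rng = random.Random(1)
toy = {c: [[rng.random() + i, rng.random() + i] for _ in range(10)]
       for i, c in enumerate(["a", "b", "c", "d"])}
dictionary = build_dictionary(toy, n_groups=2, words_per_group=3)
hist = encode(toy["a"], dictionary)
print(len(dictionary), sum(hist))  # prints "6 10"
```

The final dictionary has `n_groups * words_per_group` visual words, so its size is controlled by N rather than by the number of categories. In this sketch, that is what keeps clustering cost and word density lower than with one dictionary per category.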

Visual tag dictionary: interpreting tags with visual words

Generating descriptive visual words and visual phrases for large-scale image applications

Building Descriptive and Discriminative Visual Codebook for Large-Scale Image Applications.

Towards a Universal and Limited Visual Vocabulary.

Creating Descriptive Visual Word Tree for Tag Ranking of Social Image.

An experimental study on the universality of visual vocabularies

VSAM-Based Visual Keyword Generation for Image Caption

Word2Image: towards visual interpreting of words.

Word2Image: Towards Visual Interpretation of Words

Bag-of-Visual-Words Model Based on Classified Vector Quantization and Its Application in Image Classification

Large Visual Words For Large Scale Image Classification

Creating Descriptive Visual Words For Tag Ranking Of Compressed Social Image

Visual topic model for web image annotation.

Visual Distinctive Language: Using a Hypertopic-Based Iconic Tagging System for Knowledge Sharing

Learning attribute-aware dictionary for image classification and search

Tag2Text: Guiding Vision-Language Model via Image Tagging

From visual words to a visual grammar: using language modelling for image classification

Visual word coding based on difference maximization.

Using Visual Dictionary to Associate Semantic Objects in Region-Based Image Retrieval

Visual Stem Mapping and Geometric Tense Coding for Augmented Visual Vocabulary

An Image Classification Method Based on Multiple Visual Dictionaries