Abstract: In recent years, the bag-of-visual-words (BoV) model has become very popular in computer vision and, in particular, has been widely used for image classification. The earliest BoV-based methods use clustering to generate a single visual dictionary. This approach does not account for the differences between the categories of an image dataset, which leads to poor classification performance. A later, improved method builds a visual dictionary for each category in the dataset and then concatenates these dictionaries to produce the final visual dictionary. However, when the dataset contains many categories, this method is time consuming. In addition, building a visual dictionary for each category makes the visual words too dense, so they lose their ability to discriminate and express information, and classification performance decreases. In this paper, we propose a new image classification method based on multiple visual dictionaries. This method addresses the above problems effectively and improves classification performance significantly. To build the visual dictionary, the proposed method first randomly divides the image dataset, which contains many categories, into N parts and clusters the local features of the images in each part separately. It then generates a visual dictionary for each part. Finally, it concatenates these visual dictionaries to produce the final visual dictionary for the entire dataset. We conducted experiments on the Caltech-101 and Scene-67 image datasets and compared the results with those of traditional methods. The experimental results show that the proposed method matches the best previously published results in terms of classification accuracy.
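The dictionary construction described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the random split is taken over images, uses a plain Lloyd's k-means written in NumPy as the clustering step, and encodes an image as a normalized nearest-word histogram. The function names (`kmeans`, `build_multi_dictionary`, `encode`) and the parameters `n_parts` and `words_per_part` are hypothetical.

```python
import numpy as np


def kmeans(features, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means; returns a (k, d) array of cluster centers."""
    rng = np.random.default_rng(seed)
    # Initialize the centers from k distinct feature vectors.
    start = rng.choice(len(features), size=k, replace=False)
    centers = features[start].astype(float)
    for _ in range(n_iter):
        # Assign every feature to its nearest center (squared Euclidean).
        dists = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = features[labels == j]
            if len(members) > 0:  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    return centers


def build_multi_dictionary(image_features, n_parts, words_per_part, seed=0):
    """Split the images into n_parts random parts, cluster each part,
    and concatenate the per-part dictionaries.

    image_features: list of (n_i, d) arrays, one per image (assumed input).
    """
    if n_parts > len(image_features):
        raise ValueError("n_parts must not exceed the number of images")
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(image_features))
    dictionaries = []
    for p, idx in enumerate(np.array_split(order, n_parts)):
        # Pool the local features of all images that fell into this part.
        feats = np.vstack([image_features[i] for i in idx])
        dictionaries.append(kmeans(feats, words_per_part, seed=seed + p))
    # Final dictionary: all per-part visual words stacked together.
    return np.vstack(dictionaries)


def encode(features, dictionary):
    """Normalized histogram of nearest visual words for one image."""
    dists = ((features[:, None, :] - dictionary[None, :, :]) ** 2).sum(axis=2)
    hist = np.bincount(dists.argmin(axis=1), minlength=len(dictionary))
    return hist / hist.sum()


# Synthetic stand-in for local descriptors: 12 images, 50 features each, 8-D.
rng = np.random.default_rng(42)
images = [rng.normal(size=(50, 8)) for _ in range(12)]
dictionary = build_multi_dictionary(images, n_parts=3, words_per_part=5)
hist = encode(images[0], dictionary)
```

With 3 parts and 5 words per part, the concatenated dictionary has 15 visual words, and each image is encoded as a 15-bin histogram that could then be passed to any standard classifier. The abstract does not specify the descriptor, the clustering algorithm, or the value of N, so those choices here are placeholders.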

An Image Classification Method Based on Multiple Visual Dictionaries
