Visual word coding based on difference maximization.

Yu-Bin Yang, Ling-Yan Pan, Yang Gao, Guang-Nan He, Yao Zhang
DOI: https://doi.org/10.1016/j.neucom.2012.08.050
IF: 6
2013-01-01
Neurocomputing
Abstract: Image classification is an important topic in computer vision, and it is becoming increasingly challenging because of the rapid growth in the number and categories of images, as well as the geometric deformations and illumination variations present in image objects. The “Bag-of-Features” model, also known as the “Bag-of-Words” model, plays a fundamental role in generating efficient and discriminative image content representations by using local descriptors such as visual words, which has made it widely used for image classification problems. However, because low-level visual features have weak discriminative power and strong ambiguity, the visual codebook generated by this model, i.e., a set of visual words, is usually over-complete and inconsistent in capturing image semantics. To address this issue, we propose a novel visual word coding algorithm based on a difference maximization technique to improve the generated codebook model. Instead of mapping an image feature vector to one or multiple nearest visual words, the proposed approach uses a group of the nearest and the farthest visual words together in the coding process. Consequently, the representative variations among different image features are preserved and strengthened, which in turn significantly improves the discriminative power of the visual word descriptor. We examine the performance of our visual word coding model extensively on four standard real-world image datasets, demonstrating that it captures image semantic content more accurately and achieves superior classification performance.
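
The abstract does not give the coding formula, so the following is only a minimal sketch of the general idea of coding a feature with both its nearest and its farthest visual words. Everything specific here is an assumption made for illustration and is not taken from the paper: the function name `encode_near_far`, the parameters `k_near` and `k_far`, the use of Euclidean distance, and the choice of uniform positive weights for the nearest words and uniform negative weights for the farthest ones.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def encode_near_far(feature, codebook, k_near=2, k_far=1):
    """Code one feature against a codebook of visual words.

    Hypothetical scheme (not the paper's formulation): the k_near closest
    words share a total weight of +1, the k_far most distant words share a
    total weight of -1, and every other word gets 0.
    """
    if k_near < 1 or k_far < 0 or k_near + k_far > len(codebook):
        raise ValueError("k_near and k_far must fit within the codebook size")

    dists = [euclidean(feature, word) for word in codebook]
    # Word indices sorted from nearest to farthest.
    order = sorted(range(len(codebook)), key=lambda i: dists[i])

    near = order[:k_near]
    far = order[-k_far:] if k_far > 0 else []

    code = [0.0] * len(codebook)
    for i in near:
        code[i] = 1.0 / k_near
    for i in far:
        code[i] = -1.0 / k_far
    return code


# Toy codebook of four 2-D visual words (made-up values).
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
code = encode_near_far([0.1, 0.0], codebook, k_near=2, k_far=1)
print(code)
```

In this toy case the two closest words receive positive entries and the single most distant word receives a negative entry, so the code records both what the feature resembles and what it differs from most. An image-level descriptor would then typically be obtained by pooling such codes over all local features of the image, which is likewise an assumption here.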