A More Effective Method for Image Representation: Topic Model Based on Latent Dirichlet Allocation.

Zongmin Li, Weiwei Tian, Yante Li, Zhenzhong Kuang, Yujie Liu
DOI: https://doi.org/10.1109/cadgraphics.2015.19
2015-01-01
Abstract: The Bag-of-Words (BoW) representation is widely used in state-of-the-art image retrieval work. However, as the number of images grows rapidly, the dimension of the dictionary increases substantially, which leads to high storage and CPU costs. Moreover, local features do not convey semantic information, which is very important in image retrieval. In this paper, we propose using "topics" instead of "visual words" as the image representation, obtained through a topic model, in order to reduce the feature dimension and mine more high-level semantic information. We call this representation Bag-of-Topics (BoT); it is built on a topic model, a type of statistical model for discovering abstract "topics" from words. We extract the topics with Latent Dirichlet Allocation (LDA) and compute the similarity between images using the BoT model rather than BoW directly. The results show that the dimension of the image representation is reduced significantly while retrieval performance improves.
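The pipeline described in the abstract (visual words, then LDA topics, then similarity in topic space) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the inference method, the hyperparameters, or the similarity measure, so the collapsed Gibbs sampler, the values of `alpha` and `beta`, the cosine similarity, the toy data, and the names `fit_bot` and `cosine` are all assumptions made for illustration. Each image is assumed to be already quantized into a list of visual-word ids.

```python
import math
import random


def fit_bot(docs, n_topics, vocab_size, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Hypothetical helper: LDA via collapsed Gibbs sampling.

    docs is a list of images, each a list of visual-word ids.
    Returns one topic distribution (a BoT vector) per image.
    """
    rng = random.Random(seed)
    doc_topic = [[0] * n_topics for _ in docs]                # words of image d in topic k
    topic_word = [[0] * vocab_size for _ in range(n_topics)]  # count of word w in topic k
    topic_total = [0] * n_topics                              # all words in topic k
    assign = []                                               # topic of every word occurrence

    # Random initial topic assignment.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(z_d)

    # Resample each assignment from its conditional distribution.
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    # Smoothed per-image topic proportions: the BoT representation.
    return [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in counts]
        for counts, doc in zip(doc_topic, docs)
    ]


def cosine(u, v):
    """Cosine similarity (an assumed choice of similarity measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


# Toy data (made up): 4 images over a 12-word visual dictionary.
docs = [
    [0, 1, 2, 3, 4, 5] * 5,
    [0, 0, 1, 2, 4, 5] * 5,
    [6, 7, 8, 9, 10, 11] * 5,
    [6, 6, 7, 9, 10, 11] * 5,
]
bot = fit_bot(docs, n_topics=2, vocab_size=12)

# Each image is now a 2-dimensional topic vector instead of a 12-dimensional histogram.
for d, theta in enumerate(bot):
    print(d, [round(p, 3) for p in theta])
print("sim(0,1) =", round(cosine(bot[0], bot[1]), 3))
print("sim(0,2) =", round(cosine(bot[0], bot[2]), 3))
```

The point of the sketch is the change in dimensionality: similarity is computed over `n_topics` values per image rather than over the full dictionary size. In a realistic setting the dictionary would hold thousands of visual words and the topic count would be tuned empirically.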