Fast Search In Large-Scale Image Database Using Vector Quantization

Hj Ye, Gy Xu
DOI: https://doi.org/10.1007/3-540-45113-7_47
2003-01-01
Abstract: Practical content-based image retrieval systems require efficient indexing schemes for fast searches. Researchers have proposed many methods that use space and data partitioning for exact similarity searches. However, traditional indexing methods perform poorly and degrade to simple sequential scans at high dimensionality, the so-called "curse of dimensionality". Recently, several filtering approaches based on vector approximation (VA) were proposed and showed promising performance. Existing VA-based methods, however, assume that the dataset is independently distributed and use a scalar quantizer to partition each dimension of the data space. In real databases, images come from different categories and are often clustered. In this paper, a novel indexing method using vector quantization is proposed. This approach introduces a vector quantizer to partition the data space. It assumes a Gaussian mixture distribution and estimates this distribution with the Expectation-Maximization (EM) method. Experiments on a large database of 275,465 images demonstrated a remarkable improvement in retrieval efficiency.
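The filtering idea described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the codebook here is trained with plain k-means (Lloyd's algorithm) as a simple stand-in for the EM-fitted Gaussian mixture, and the pruning rule (a triangle-inequality lower bound from each cell's centroid and radius) is an assumption chosen to keep the search exact. All function names (`train_codebook`, `build_index`, `knn_search`) are hypothetical.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def train_codebook(data, k, iters=10, seed=0):
    """Lloyd's k-means; a stand-in (assumption) for the EM-fitted mixture."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(data, k)]
    for _ in range(iters):
        cells = [[] for _ in range(k)]
        for p in data:
            j = min(range(k), key=lambda c: dist(p, centroids[c]))
            cells[j].append(p)
        for j, cell in enumerate(cells):
            if cell:  # keep the old centroid if a cell is empty
                centroids[j] = [sum(col) / len(cell) for col in zip(*cell)]
    return centroids


def build_index(data, centroids):
    """Assign each vector to its nearest codeword and record each cell's radius."""
    cells = [[] for _ in centroids]
    for i, p in enumerate(data):
        j = min(range(len(centroids)), key=lambda c: dist(p, centroids[c]))
        cells[j].append(i)
    radii = [max((dist(data[i], c) for i in cell), default=0.0)
             for cell, c in zip(cells, centroids)]
    return cells, radii


def knn_search(query, data, centroids, cells, radii, k):
    """Exact k-NN: visit cells by lower bound and stop once no cell can help."""
    # Any vector in cell j is at least dist(query, centroid_j) - radius_j away.
    bounds = sorted(
        (max(0.0, dist(query, c) - r), j)
        for j, (c, r) in enumerate(zip(centroids, radii))
    )
    best = []    # (distance, index) pairs, sorted, at most k entries
    scanned = 0  # number of full-vector distance computations
    for lower, j in bounds:
        if len(best) == k and lower > best[-1][0]:
            break  # every remaining cell has an even larger lower bound
        for i in cells[j]:
            scanned += 1
            best.append((dist(query, data[i]), i))
        best = sorted(best)[:k]
    return best, scanned
```

On clustered data, `scanned` is typically much smaller than the database size, because whole cells are rejected from their centroid and radius alone. On data without cluster structure the bound rarely prunes anything and the search approaches a sequential scan.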