Near-Optimal Partial Linear Scan for Nearest Neighbor Search in High-Dimensional Space.

Jiangtao Cui, Zi Huang, Bo Wang, Yingfan Liu
DOI: https://doi.org/10.1007/978-3-642-37487-6_10
2013-01-01
Abstract: One-dimensional mapping plays an important role in nearest neighbor search in high-dimensional space. This paper discusses two typical kinds of one-dimensional mapping methods: direct projection, and distance computation with respect to reference points. An optimal combination of one-dimensional mappings is obtained for the best search performance. Furthermore, we propose a near-optimal partial linear scan algorithm that considers several one-dimensional mapping values. During the linear scan, the partial distance to the query point, computed in the 1D space, serves as a lower bound for filtering out unqualified data points. A new indexing structure based on clustering with a Gaussian mixture is also designed to facilitate the partial linear scan; it dramatically reduces both the I/O cost and the number of distance computations. Comprehensive experiments are conducted on several real-life datasets of different dimensionalities. The experimental results show that the proposed indexing structure outperforms existing well-known high-dimensional indexing methods.
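The lower-bound filtering idea in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes a single reference point, Euclidean distance, and an in-memory sorted list, and it omits the Gaussian mixture clustering and the combination of several mappings. The function names (build_index, knn_partial_scan) are hypothetical. Each point p is mapped to the 1D key d(p, r), its distance to a reference point r. By the triangle inequality, |d(p, r) - d(q, r)| never exceeds d(p, q), so the gap between two keys is a valid lower bound on the true distance, and the scan can stop once that gap exceeds the current k-th best distance.

```python
import bisect
import heapq
import math


def euclidean(a, b):
    """Full-dimensional Euclidean distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_index(points, reference):
    """Sort point ids by their 1D key: the distance to the reference point."""
    keyed = sorted((euclidean(p, reference), i) for i, p in enumerate(points))
    keys = [key for key, _ in keyed]
    ids = [i for _, i in keyed]
    return keys, ids


def knn_partial_scan(points, keys, ids, reference, query, k):
    """Return (neighbors, n_computed).

    neighbors is a list of (distance, point_id) pairs in ascending order;
    n_computed counts the full-dimensional distance computations performed.
    """
    q_key = euclidean(query, reference)
    right = bisect.bisect_left(keys, q_key)  # first key >= q_key
    left = right - 1
    heap = []  # max-heap via negated distances, holds the current k best
    computed = 0
    while left >= 0 or right < len(keys):
        # Visit the unvisited neighbor whose key is closest to the query's key.
        gap_left = q_key - keys[left] if left >= 0 else math.inf
        gap_right = keys[right] - q_key if right < len(keys) else math.inf
        if gap_left <= gap_right:
            gap, pos = gap_left, left
            left -= 1
        else:
            gap, pos = gap_right, right
            right += 1
        # The key gap is a lower bound on the true distance. Gaps only grow
        # from here on, so no remaining point can improve the result.
        if len(heap) == k and gap > -heap[0][0]:
            break
        d = euclidean(points[ids[pos]], query)
        computed += 1
        if len(heap) < k:
            heapq.heappush(heap, (-d, ids[pos]))
        elif d < -heap[0][0]:
            heapq.heapreplace(heap, (-d, ids[pos]))
    return sorted((-neg_d, i) for neg_d, i in heap), computed
```

Because points are visited in order of increasing key gap, the early exit is exact: the result matches a full linear scan, and only the number of full distance computations changes. How much is saved depends on how well the chosen reference point spreads the keys, which is the kind of question the paper's optimal combination of mappings addresses.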