LazyLSH: Approximate Nearest Neighbor Search for Multiple Distance Functions with a Single Index

Yuxin Zheng, Qi Guo, Anthony K. H. Tung, Sai Wu
DOI: https://doi.org/10.1145/2882903.2882930
2016-01-01
Abstract: Due to the "curse of dimensionality", processing nearest neighbor (NN) queries in high-dimensional spaces is very expensive; hence, approximate approaches such as Locality-Sensitive Hashing (LSH) are widely used for their theoretical guarantees and empirical performance. Current LSH-based approaches target the l_1 and l_2 spaces, while previous work has shown that fractional distance metrics (l_p metrics with 0 < p < 1) can provide more insightful results than the usual l_1 and l_2 metrics for data mining and multimedia applications. However, no existing work supports multiple fractional distance metrics using one index. In this paper, we propose LazyLSH, which answers approximate nearest neighbor queries for multiple l_p metrics with theoretical guarantees. Unlike previous LSH approaches, which need to build one dedicated index for every query space, LazyLSH uses a single base index to support the computations in multiple l_p spaces, significantly reducing the maintenance overhead. Extensive experiments show that LazyLSH provides more accurate results for approximate kNN search under fractional distance metrics.
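To make the l_p notation concrete, the following minimal Python sketch computes the l_p distance, (sum_i |x_i - y_i|^p)^(1/p), and finds the exact nearest neighbor by exhaustive scan. It is not the LazyLSH algorithm and uses no index; it only illustrates that the nearest neighbor of the same query can change with p. The function names `lp_distance` and `nearest` and the sample points are assumptions made for this illustration.

```python
from typing import Sequence


def lp_distance(x: Sequence[float], y: Sequence[float], p: float) -> float:
    """l_p distance (sum_i |x_i - y_i|^p)^(1/p); 0 < p < 1 gives the fractional case."""
    if p <= 0:
        raise ValueError("p must be positive")
    if len(x) != len(y):
        raise ValueError("points must have the same dimensionality")
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def nearest(query: Sequence[float], points: Sequence[Sequence[float]], p: float) -> int:
    """Index of the exact nearest neighbor under l_p, found by exhaustive scan."""
    return min(range(len(points)), key=lambda i: lp_distance(query, points[i], p))


if __name__ == "__main__":
    query = (0.0, 0.0)
    points = [(3.0, 0.0), (2.0, 2.0)]  # hypothetical sample data
    for p in (0.5, 1.0, 2.0):
        print(f"p={p}: nearest index = {nearest(query, points, p)}")
```

With these sample points, the point (3, 0) is nearest under l_0.5 and l_1, while (2, 2) is nearest under l_2, so an index built for one metric does not directly answer queries for another.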