Abstract: The $k$ nearest neighbor ($k$NN) query is a fundamental problem in databases. Given a set of multidimensional data points and a query point, $k$NN returns the $k$ nearest neighbors according to a scoring function, such as a weighted sum defined by an attribute weight vector. In practice, however, the attribute weight vector can be difficult to specify. The skyline query returns a set of points that contains every possible nearest neighbor without requiring an exact attribute weight vector or a scoring function, but the number of returned points can be prohibitively large for practical use. In this paper, we propose a novel \emph{eclipse} definition that is more flexible and customizable than the classic $1$NN and skyline definitions. In eclipse, users can specify a range of attribute weights and thereby control the number of returned points. We show that both $1$NN and skyline are instantiations of eclipse. To compute eclipse points, we propose a baseline algorithm with time complexity $O(n^2 2^{d-1})$ and an improved transformation-based algorithm with time complexity $O(n\log^{d-1} n)$ that transforms the eclipse problem into the skyline problem, where $n$ is the number of points and $d$ is the number of dimensions. Furthermore, we propose a novel index-based algorithm that uses a duality transform to achieve much better efficiency. Experimental results on a real NBA dataset and on synthetic datasets demonstrate the effectiveness and efficiency of our eclipse algorithms.
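
As a rough illustration of the eclipse idea described above, the following Python sketch computes eclipse points by brute force. It is not taken from the paper: it assumes a weighted-sum score over coordinates that are already distances to the query (lower is better), it assumes the weight range is given as ratio intervals relative to the last attribute (whose weight is fixed to 1), and all function names are hypothetical. Because a weighted sum is linear in the weights, comparing two points at the $2^{d-1}$ corners of the weight box is enough to decide whether one is never worse than the other over the whole box.

```python
from itertools import product


def score(point, weights):
    """Weighted-sum score; lower means closer to the query (assumption)."""
    return sum(w * x for w, x in zip(weights, point))


def corner_weights(ratio_ranges):
    """Return the 2^(d-1) corner weight vectors of the weight box.

    ratio_ranges holds one (low, high) interval per attribute except the
    last, whose weight is fixed to 1.0 (an assumed normalization).
    """
    return [tuple(corner) + (1.0,) for corner in product(*ratio_ranges)]


def dominates(p, q, corners):
    """True if p scores no worse than q at every corner and better at one."""
    no_worse = all(score(p, w) <= score(q, w) for w in corners)
    better = any(score(p, w) < score(q, w) for w in corners)
    return no_worse and better


def eclipse(points, ratio_ranges):
    """Brute-force eclipse: keep points that no other point dominates."""
    corners = corner_weights(ratio_ranges)
    result = []
    for i, p in enumerate(points):
        dominated = any(
            dominates(q, p, corners)
            for j, q in enumerate(points)
            if j != i
        )
        if not dominated:
            result.append(p)
    return result


if __name__ == "__main__":
    pts = [(1, 6), (4, 4), (6, 1), (5, 5)]
    # Moderate weight range: ratio w1/w2 anywhere in [0.5, 2].
    print(eclipse(pts, [(0.5, 2.0)]))
    # Degenerate range: a single weight vector, i.e. 1NN with ties kept.
    print(eclipse(pts, [(1.0, 1.0)]))
```

In this sketch, collapsing each ratio interval to a single value leaves only the best-scoring points under that one weight vector, while widening the intervals moves the result toward the skyline, which mirrors the claim that both are instantiations of eclipse.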

Efficient index-based KNN join processing for high-dimensional data

Accelerating Exact Nearest Neighbor Search in High Dimensional Euclidean Space Via Block Vectors

Eclipse: Practicability Beyond Knn and Skyline

Eclipse: Generalizing Knn and Skyline

High-dimensional kNN joins with incremental updates

Exploring Bit-Difference for Approximate KNN Search in High-Dimensional Databases

Composite Distance Transformation for Indexing and K-Nearest-neighbor Searching in High-Dimensional Spaces

Constrained All-k-Nearest-Neighbor Search

iDistance: An adaptive B+-tree based indexing method for nearest neighbor search

Preserving-Ignoring Transformation Based Index for Approximate k Nearest Neighbor Search

Efficient Processing of k Nearest Neighbor Joins using MapReduce

Efficient Parallel Processing of High-Dimensional Spatial KNN Queries

Enhancing K-nearest neighbor algorithm: a comprehensive review and performance analysis of modifications

Efficient Data-aware Distance Comparison Operations for High-Dimensional Approximate Nearest Neighbor Search

Diagonal Ordering: A New Approach to High-Dimensional KNN Processing

Adaptive Quantization Of The High-Dimensional Data For Efficient Knn Processing

Contorting High Dimensional Data for Efficient Main Memory KNN Processing

FLEX: A Fast and Light-weight Learned Index for kNN Search in High-Dimensional Space

On efficient mutual nearest neighbor query processing in spatial databases

Efficient parallel processing for K-nearest-neighbor search in spatial databases

Indexing High-Dimensional Data in Dual Distance Spaces