Making the Pyramid Technique Robust to Query Types and Workloads

Rui Zhang, Beng Chin Ooi, Kian-Lee Tan
DOI: https://doi.org/10.1109/ICDE.2004.1320007
2004-01-01
Abstract: The effectiveness of many existing high-dimensional indexing structures is limited to specific types of queries and workloads. For example, while the Pyramid technique and the iMinMax are efficient for window queries, the iDistance is superior for kNN queries. In this paper, we present a new structure, called the P+-tree, that supports both window queries and kNN queries efficiently under different workloads. In the P+-tree, a B+-tree is employed to index the data points as follows. The data space is partitioned into subspaces based on clustering, and the points in each subspace are mapped onto a single-dimensional space using the Pyramid technique and stored in the B+-tree. The crux of the scheme lies in the transformation of the data, which has two crucial properties. First, it maps each subspace into a hypercube so that the Pyramid technique can be applied. Second, it shifts the cluster center to the top of the pyramid, which is the case in which the Pyramid technique works very efficiently. We present window and kNN query processing algorithms for the P+-tree. Through an extensive performance study, we show that the P+-tree achieves considerable speedup over the Pyramid technique and the iMinMax for window queries and outperforms the iDistance for kNN queries.
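To make the mapping onto a single-dimensional space more concrete, the following is a minimal sketch of the pyramid value as it is commonly defined for the original Pyramid technique on the unit hypercube [0, 1]^d, whose center (0.5, ..., 0.5) is the shared top of the 2d pyramids. It is not the P+-tree transformation itself: the clustering step and the per-subspace transformation described in the abstract are not shown, and the function name `pyramid_value` is a hypothetical choice made for illustration.

```python
from typing import Sequence


def pyramid_value(point: Sequence[float]) -> float:
    """Map a point in [0, 1]^d to a one-dimensional key (illustrative sketch).

    Assumes the data space is the unit hypercube and that the top of every
    pyramid is the center (0.5, ..., 0.5). The name is hypothetical.
    """
    d = len(point)
    if d == 0:
        raise ValueError("point must have at least one dimension")

    # The dimension in which the point deviates most from the center
    # decides which pyramid contains the point.
    j_max = max(range(d), key=lambda j: abs(0.5 - point[j]))

    # The height is the distance from the center along that dimension.
    height = abs(0.5 - point[j_max])

    # Pyramids 0..d-1 lie on the "low" side of the center,
    # pyramids d..2d-1 on the "high" side.
    pyramid = j_max if point[j_max] < 0.5 else j_max + d

    # The integer part identifies the pyramid, the fraction the height,
    # so the resulting keys can be stored in an ordinary B+-tree.
    return pyramid + height
```

For instance, in two dimensions `pyramid_value([0.1, 0.5])` falls in pyramid 0 at a height of about 0.4, so its key is about 0.4. Because the height never exceeds 0.5, keys from different pyramids occupy disjoint ranges, which is what allows a window query to be translated into a small number of one-dimensional range scans.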