Hypercube-Based High-Dimensional Index Using Co-clustering

LIU Yingfan, CUI Jiangtao
DOI: https://doi.org/10.3778/j.issn.1673-9418.2012.11.005
2012-01-01
Abstract: The performance of nearest neighbor search on high-dimensional datasets suffers from the well-known "curse of dimensionality". This paper proposes a novel index for high-dimensional queries, hypercube on co-clustering (HC2). Co-clustering reduces both the size and the dimensionality of the original dataset simultaneously, yielding a set of low-dimensional clusters. Each cluster is described by a bounding hypercube, so lower bounds on the actual distances between the query point and the clusters can be computed efficiently, which enables fast and lossless similarity search with the filter-and-refine approach. To obtain tighter lower bounds, the paper investigates a statistically optimal hypercube description, SOHC2 (statistically optimized hypercube on co-clustering), which in a statistical sense generates the fewest candidates for actual distance computation. Experimental results show that SOHC2 is up to 3 times faster than other index structures based on co-clustering, and that it also offers significant performance advantages over other existing methods.
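The filter-and-refine idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' HC2 or SOHC2 implementation: the clusters are assumed to be given already (no co-clustering or dimensionality reduction is performed), each hypercube is simply the axis-aligned bounding box of its cluster, and the function names `box_lower_bound`, `build_boxes`, and `nearest_neighbor` are hypothetical. The sketch only shows why a box-based lower bound permits pruning without losing the exact nearest neighbor.

```python
import math


def box_lower_bound(q, lo, hi):
    """Euclidean distance from q to the closest point of the box [lo, hi].

    No point inside the box can be closer to q than this value,
    so it is a valid lower bound for every point of the cluster.
    """
    s = 0.0
    for x, l, h in zip(q, lo, hi):
        if x < l:
            s += (l - x) ** 2
        elif x > h:
            s += (x - h) ** 2
    return math.sqrt(s)


def build_boxes(clusters):
    """Describe each cluster by its per-dimension minimum and maximum (assumed layout)."""
    boxes = []
    for pts in clusters:
        lo = [min(col) for col in zip(*pts)]
        hi = [max(col) for col in zip(*pts)]
        boxes.append((lo, hi))
    return boxes


def nearest_neighbor(q, clusters, boxes):
    """Exact nearest neighbor search with filter-and-refine.

    Filter: visit clusters in ascending order of their lower bound and stop
    as soon as a bound reaches the best distance found so far.
    Refine: compute actual distances only inside clusters that survive.
    Returns (best point, its distance, number of actual distance computations).
    """
    bounds = [box_lower_bound(q, lo, hi) for lo, hi in boxes]
    order = sorted(range(len(clusters)), key=lambda i: bounds[i])
    best, best_d, refined = None, math.inf, 0
    for i in order:
        if bounds[i] >= best_d:
            break  # every remaining cluster has an equal or larger bound
        for p in clusters[i]:
            d = math.dist(q, p)
            refined += 1
            if d < best_d:
                best, best_d = p, d
    return best, best_d, refined


clusters = [[(0.0, 0.0), (1.0, 1.0)], [(10.0, 10.0), (11.0, 12.0)]]
boxes = build_boxes(clusters)
best, best_d, refined = nearest_neighbor((0.9, 1.2), clusters, boxes)
print(best, refined)
```

In this toy run the second cluster is never refined, because its lower bound already exceeds the best distance found in the first one. A tighter box description lowers the number of refined candidates, which is the quantity the abstract says SOHC2 is designed to minimize.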