ISIS: a new approach for efficient similarity search in sparse databases

Bin Cui, Jiakui Zhao, Gao Cong
DOI: https://doi.org/10.1007/978-3-642-12098-5_18
2010-01-01
Abstract: High-dimensional sparse data is prevalent in many real-life applications. In this paper, we propose a novel index structure for accelerating similarity search in high-dimensional sparse databases, named ISIS, which stands for Indexing Sparse databases using Inverted fileS. ISIS clusters a dataset and converts the original high-dimensional space into a new space in which each dimension represents a cluster; the key values in the new space are then used by inverted-file indexes. We also propose an extension of ISIS, named ISIS+, which partitions the data space into lower-dimensional subspaces and clusters the data within each subspace. An extensive experimental study demonstrates the superiority of our approaches in high-dimensional sparse databases.
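The general idea of clustering sparse data and indexing the per-cluster key values with inverted files can be illustrated with a minimal sketch. This is not the ISIS algorithm itself, since the abstract does not give its details. The sketch assumes that centroids are already available, that each point is posted only in the list of its most similar centroid, that the key value is the cosine similarity to that centroid, and that a query probes only the closest clusters. The names cosine, build_index, and search are hypothetical.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as {dimension: value} dicts."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the shorter vector
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def build_index(points, centroids):
    """Post each point in the inverted list of its most similar centroid.

    Assumption: the key value is the similarity to the assigned centroid.
    """
    index = defaultdict(list)  # cluster id -> [(key value, point id), ...]
    for pid, vec in points.items():
        sims = [cosine(vec, c) for c in centroids]
        cid = max(range(len(centroids)), key=sims.__getitem__)
        index[cid].append((sims[cid], pid))
    for postings in index.values():
        postings.sort(reverse=True)  # highest key value first
    return index


def search(query, points, centroids, index, k=1, probes=1):
    """Approximate top-k search: scan only the `probes` closest clusters."""
    sims = [cosine(query, c) for c in centroids]
    order = sorted(range(len(centroids)), key=sims.__getitem__, reverse=True)
    candidates = [pid for cid in order[:probes] for _, pid in index.get(cid, [])]
    # Re-rank the candidates by their exact similarity to the query.
    candidates.sort(key=lambda pid: cosine(query, points[pid]), reverse=True)
    return candidates[:k]
```

Because only the probed clusters are scanned, a search under these assumptions touches a fraction of the database, at the cost of possibly missing neighbors that fall in clusters that were not probed.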