MC-Tree: Dynamic Index Structure for Partially Clustered Multi-Dimensional Database

靳晓明,王丽坤,陆玉昌,石纯一
2003-01-01
Tsinghua Science & Technology
Abstract:Index structure that enables efficient similarity queries in high-dimensional space is crucial for many applications. This paper discusses the indexing problem in dataset composed of partially clustered data, which exists in many applications. Current index methods are inefficient with partially clustered datasets. The dynamic and adaptive index structure presented here, called a multi-cluster tree (MC-tree), consists of a set of height-balanced trees for indexing. This index structure improves the querying efficiency in three ways: 1) Most bounding regions achieve uniform distributions, which results in fewer splits and less overlap compared with a single indexing tree. 2) The clusters in the dataset are dynamically detected when the index is updated. 3) The query process does not involve a sequential scan. The MC-tree was shown to be better than hierarchical and cluster-based indexes for the partially clustered datasets.
What problem does this paper attempt to address?