UD(k, l)-Index: An Efficient Approximate Index for XML Data
Hongwei Wu, Qing Wang, Jeffrey Xu Yu, Aoying Zhou, Shuigeng Zhou
DOI: https://doi.org/10.1007/978-3-540-45160-0_8
2003-01-01
Abstract: XML has become the main standard for data presentation and exchange on the Internet. Processing path expressions plays a key role in XML query evaluation. Path indices can speed up the evaluation of path expressions on XML data by restricting the search to the relevant portion of the data. However, to answer all path expressions accurately, traditional path indices group data nodes according to the paths from the root of the data graph to the nodes in question, regardless of the paths fanning out from these nodes. This leads to large indices and inefficient evaluation of branching path expressions. In this paper, we present UD(k, l)-indices, a family of efficient approximate index structures in which data nodes are grouped according to their incoming paths of length up to k and outgoing paths of length up to l. UD(k, l)-indices fully exploit the local similarity of XML data nodes on their upward and downward paths, so they can be used to evaluate path expressions efficiently, especially branching path expressions. For small values of k and l, the UD(k, l)-index is approximate, so we use a validation-based approach to find exact answers to path expressions. Experiments show that, with proper values of k and l, the UD(k, l)-index can significantly improve the performance of path expression evaluation with low space overhead.
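
To make the grouping idea concrete, the following is a minimal sketch of one plausible reading of the abstract, not the paper's construction algorithm. It assumes that "incoming paths of length up to k" can be approximated by k rounds of partition refinement over parent edges, and "outgoing paths of length up to l" by l rounds over child edges, with the two partitions then intersected. The names `refine` and `ud_partition` and the toy data graph are hypothetical and introduced only for illustration.

```python
from collections import defaultdict


def refine(labels, neighbors, rounds):
    """Partition nodes by label, then refine `rounds` times using neighbor classes."""
    # Round 0: nodes with the same label share a class.
    cls = dict(labels)
    for _ in range(rounds):
        # New signature: the node's old class plus the set of its neighbors' old classes.
        sig = {n: (cls[n], frozenset(cls[m] for m in neighbors[n])) for n in labels}
        # Replace signatures by small integer ids so they do not grow each round.
        ids = {}
        cls = {n: ids.setdefault(sig[n], len(ids)) for n in labels}
    return cls


def ud_partition(labels, edges, k, l):
    """Group nodes by upward similarity (k rounds) and downward similarity (l rounds)."""
    parents = defaultdict(list)
    children = defaultdict(list)
    for src, dst in edges:
        children[src].append(dst)
        parents[dst].append(src)
    up = refine(labels, parents, k)      # looks at incoming paths
    down = refine(labels, children, l)   # looks at outgoing paths
    groups = defaultdict(list)
    for n in labels:
        groups[(up[n], down[n])].append(n)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical data graph: root -> a, a, b; first a -> c; second a -> d; b -> c.
labels = {0: "root", 1: "a", 2: "a", 3: "b", 4: "c", 5: "d", 6: "c"}
edges = [(0, 1), (0, 2), (0, 3), (1, 4), (2, 5), (3, 6)]

for k, l in [(0, 0), (1, 0), (0, 1), (1, 1)]:
    print(k, l, ud_partition(labels, edges, k, l))
```

On this toy graph, raising k from 0 to 1 separates the two `c` nodes (their parents differ), while raising l from 0 to 1 separates the two `a` nodes (their children differ). This illustrates why small k and l give a coarse, approximate grouping whose answers may need validation against the data, and why larger values give finer groups at the cost of a larger index.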