Efficient Processing of Ordered XML Twig Pattern Matching Based on Extended Dewey
Jin-hua Jiang,Ke Chen,Xiao-yan Li,Gang Chen,Li-dan Shou
DOI: https://doi.org/10.1631/jzus.a0920006
2009-01-01
Abstract:Finding all occurrences of a twig pattern is a core operation of extensible markup language (XML) query processing. Holistic twig join algorithms, which avoid a large number of intermediate results, represent the state-of-the-art algorithms. However, ordered XML twig join is mentioned rarely in the literature and previous algorithms developed in attempts to solve the problem of ordered twig pattern (OTP) matching have poor performance. In this paper, we first propose a novel children linked stacks encoding scheme to represent compactly the partial ordered twig join results. Based on this encoding scheme and extended Dewey, we design a novel holistic OTP matching algorithm, called OTJFast, which needs only to access the labels of the leaf query nodes. Furthermore, we propose a new algorithm, named OTJFaster, incorporating three effective optimization rules to avoid unnecessary computations. This works well on available indices (such as B + -tree), skipping useless elements. Thus, not only is disk access reduced greatly, but also many unnecessary computations are avoided. Finally, our extensive experiments over both real and synthetic datasets indicate that our algorithms are superior to previous approaches.