A Twig Join Algorithm for A Query with Id References

Dong Li, Lin Zhao, Jing Li
DOI: https://doi.org/10.1109/apscc.2014.26
2014-01-01
Abstract: The ID/IDREF feature turns the XML document model into a graph structure rather than a tree structure, but traditional Twig join algorithms can process only simple queries without ID references. Queries with ID references often involve attribute nodes or predicates with expressions, neither of which exists in a traditional Twig pattern, so a Twig join algorithm designed for queries involving ID references is needed. Typical Twig join algorithms include Twig²Stack, TwigList, and TwigMix. Twig²Stack uses overly complicated data structures with large memory overhead. TwigList uses simple lists but lacks efficient filtering of useless elements. TwigMix introduces the getNext() function into TwigList to avoid manipulating useless elements for the ancestor-descendant (AD) relationship in the stack and lists, but it filters out some useful elements when processing queries that involve attribute nodes or predicates with expressions. To this end, we propose a new algorithm, TwigExpand, which processes queries involving attribute nodes or predicates with expressions while avoiding the manipulation of useless elements for both the parent-child (PC) and AD relationships. In addition, we propose TwigGraph, an extension of TwigExpand that processes queries involving ID references; an experimental study shows that it is much faster than binary structural join.
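The AD and PC relationships mentioned in the abstract, and the way ID references add edges outside the tree, can be illustrated with a minimal sketch. It is not TwigExpand or TwigGraph; it uses the region labeling (start, end, level) that is common in structural join work. The sample document, the function names, and the choice of `id` and `author` as the ID and IDREF attributes are all assumptions made for illustration, since without a DTD the parser cannot tell which attributes are of type ID or IDREF.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: <book author="a1"> refers to <author id="a1">.
DOC = """
<library>
  <author id="a1"><name>Li</name></author>
  <book author="a1"><title>Twig Joins</title></book>
</library>
"""

def label(root):
    """Assign a (start, end, level) region label to every element, depth-first."""
    labels = {}
    counter = 0

    def visit(node, level):
        nonlocal counter
        counter += 1
        start = counter
        for child in node:
            visit(child, level + 1)
        counter += 1
        labels[node] = (start, counter, level)

    visit(root, 0)
    return labels

def is_ancestor(a, d):
    """AD relationship: the region of d lies strictly inside the region of a."""
    return a[0] < d[0] and d[1] < a[1]

def is_parent(a, d):
    """PC relationship: AD relationship plus exactly one level of difference."""
    return is_ancestor(a, d) and a[2] + 1 == d[2]

def reference_edges(root, ref_attr="author"):
    """Collect (referrer, target) pairs; `id` and `ref_attr` are assumed names."""
    ids = {e.get("id"): e for e in root.iter() if e.get("id") is not None}
    return [(e, ids[e.get(ref_attr)]) for e in root.iter() if e.get(ref_attr) in ids]

root = ET.fromstring(DOC.strip())
labels = label(root)
author, book = root.find("author"), root.find("book")
name, title = author.find("name"), book.find("title")
edges = reference_edges(root)
```

In this sketch `is_ancestor(labels[root], labels[name])` holds while `is_parent(labels[root], labels[name])` does not, and `edges` contains a link from `book` to `author` that no region label can express. That extra link is what makes the document a graph rather than a tree.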