A novel three-phase XML twig pattern matching algorithm based on version tree

Guiquan Liu, Meiling Yao, Desheng Wang, Enhong Chen
DOI: https://doi.org/10.1109/FSKD.2011.6019809
2011-01-01
Abstract: At present, there are three main research directions in querying and searching XML data: the structure index method, the node-based encoding method, and the sequence method. However, a common problem in querying and searching XML data is that both the execution time and the input size of the algorithms grow rapidly as the size of the XML document increases. To overcome this problem, we propose a new three-phase XML twig pattern matching algorithm called Twig3Version. The new algorithm first runs a holistic XML twig pattern matching algorithm on a structure index named Version Tree, which compresses all repetitive structures in the XML document, and returns the subtrees of the Version Tree that structurally match the query twig. It then applies a simple and efficient version filter module to these concise intermediate results to find the matching versions. Finally, it merges the elements in the original document that correspond to these matching versions to generate the final results. Because structural matching runs on the concise structure index and version filtering runs on the concise intermediate results, the new algorithm outperforms existing XML twig pattern matching algorithms. Both theoretical analysis and experimental results indicate the superiority of the new algorithm.
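The three phases described in the abstract can be illustrated with a small Python sketch. This is not the authors' Twig3Version implementation, and the abstract does not define the Version Tree in detail. The sketch assumes a much simpler index: elements that share the same root-to-node tag path are collapsed into one index node, and a "version" is taken to be the set of child tags an element has. The names Elem, build_index, structural_match, matching_versions, merge and twig_match are all hypothetical, and only parent-child twig edges are handled.

```python
from collections import defaultdict


class Elem:
    """A document element: a tag, child elements, and an optional id."""
    def __init__(self, tag, children=(), eid=None):
        self.tag = tag
        self.children = list(children)
        self.eid = eid


def build_index(root):
    """Collapse elements with the same root-to-node tag path into one node.

    index[path][version] -> list of elements, where a 'version' is the
    frozenset of child tags of an element (an assumption of this sketch).
    """
    index = defaultdict(lambda: defaultdict(list))
    stack = [(root, (root.tag,))]
    while stack:
        elem, path = stack.pop()
        version = frozenset(child.tag for child in elem.children)
        index[path][version].append(elem)
        for child in elem.children:
            stack.append((child, path + (child.tag,)))
    return index


def structural_match(index, twig, path):
    """Phase 1: check on the index alone whether the twig can occur at path."""
    tag, kids = twig
    if path not in index or path[-1] != tag:
        return False
    return all(structural_match(index, kid, path + (kid[0],)) for kid in kids)


def matching_versions(index, twig, path):
    """Phase 2: keep versions whose child-tag set covers the twig's children."""
    needed = {kid[0] for kid in twig[1]}
    return [version for version in index[path] if needed <= version]


def embeds(elem, twig):
    """Check recursively that a document element really contains the twig."""
    tag, kids = twig
    return elem.tag == tag and all(
        any(embeds(child, kid) for child in elem.children) for kid in kids
    )


def merge(index, twig, path):
    """Phase 3: collect the document elements behind the matching versions."""
    results = []
    for version in matching_versions(index, twig, path):
        results.extend(e for e in index[path][version] if embeds(e, twig))
    return results


def twig_match(root, twig):
    """Run the three phases; a twig is a pair (tag, [child twigs])."""
    index = build_index(root)
    results = []
    for path in list(index):
        if structural_match(index, twig, path):
            results.extend(merge(index, twig, path))
    return results


# Toy document: three books, one of them without an author.
doc = Elem("lib", [
    Elem("book", [Elem("title"), Elem("author")], eid="b1"),
    Elem("book", [Elem("title")], eid="b2"),
    Elem("book", [Elem("title"), Elem("author")], eid="b3"),
])
twig = ("book", [("title", []), ("author", [])])
matches = sorted(e.eid for e in twig_match(doc, twig))  # ['b1', 'b3']
```

In this toy model the three book elements collapse into a single index node with two versions, so the structural match touches one node instead of three, and the version filter discards the book without an author before any document element is read. The final embeds check is a safeguard added for this sketch, since the simplified version filter only inspects direct children.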