Design and Implementation of a Parallel Data Partitioning Algorithm for XML Data

Tang Nan (汤南), Yu Yaxin (于亚新), Wang Guoren (王国仁), Yu Ge (于戈)
DOI: https://doi.org/10.3969/j.issn.1000-1220.2004.07.014
2004-01-01
Abstract: With the wide use of XML in Web applications, the scale and size of XML documents are increasing rapidly, and query processing is becoming more complicated than in traditional databases. Centralized environments cannot meet the requirements of Web applications well because very large XML documents cause an I/O bottleneck. Parallel query processing is one of the promising approaches to removing this bottleneck, and data partitioning is one of its key issues. In this paper we propose Node-based Round Robin data partitioning, NRR for short, which partitions a very large XML document so that queries on the document can be processed in parallel. Our experimental results show that the method achieves good speedup and scaleup.
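The abstract does not spell out how NRR assigns nodes, so the following is only a minimal sketch of the general idea of round-robin partitioning over XML nodes, not the authors' algorithm. It assumes that nodes are visited in document order and dealt out to partitions in turn. The function name `round_robin_partition` and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET


def round_robin_partition(xml_text, num_partitions):
    """Deal element nodes out to partitions in turn (illustrative only).

    Assumption: nodes are taken in document (pre-order) order. The paper's
    actual NRR scheme may choose and group nodes differently.
    """
    root = ET.fromstring(xml_text)
    partitions = [[] for _ in range(num_partitions)]
    # Element.iter() yields the root and all descendants in document order.
    for index, node in enumerate(root.iter()):
        partitions[index % num_partitions].append(node.tag)
    return partitions


# Hypothetical sample document with six element nodes.
doc = "<lib><book><title/><author/></book><book><title/></book></lib>"
parts = round_robin_partition(doc, 2)
```

Dealing nodes out in turn keeps partition sizes within one node of each other, which is the usual motivation for round-robin placement. A real system would also need to record parent-child relationships across partitions so that structural queries can still be answered.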