QReduction: Synopsizing XPath Query Set Efficiently under Resource Constraint

Jun Gao, Xiuli Ma, Dongqing Yang, Tengjiao Wang, Shiwei Tang
DOI: https://doi.org/10.1007/978-3-540-27772-9_39
2004-01-01
Abstract: Evaluating a massive XPath query set over XML streams poses great challenges to database researchers. Current work focuses chiefly on evaluating massive XPath sets efficiently to obtain precise results. The size of the input query set strongly affects both the resource requirements and the efficiency of evaluation. In this paper, we propose a novel method, QReduction, that produces a synopsized XPath query set to represent the original query set while minimizing the 'precision loss' caused by the synopsis. QReduction first discovers frequent patterns among the massive input XPath tree patterns, and then selects the query set synopsis from them based on a dynamic benefit model under resource constraints. Because frequent pattern discovery has high complexity, we propose optimization methods that push the constraints of QReduction into the discovery process. We propose three criteria, namely recall, precision, and intersection, to determine a better synopsis. The experimental results demonstrate that our method can produce a query set synopsis with high precision, recall, and intersection under given resource constraints.
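The selection step described in the abstract (choosing a synopsis from candidate patterns under a resource constraint, with benefits that change as patterns are chosen) can be illustrated with a minimal greedy sketch. This is not the paper's algorithm: the function name `select_synopsis`, the coverage-per-cost benefit score, the integer costs, and the representation of a pattern as the set of original query ids it covers are all assumptions made for illustration.

```python
from typing import Dict, FrozenSet, List, Set


def select_synopsis(
    candidates: Dict[str, FrozenSet[int]],
    cost: Dict[str, int],
    budget: int,
) -> List[str]:
    """Greedily pick candidate patterns without exceeding the budget.

    candidates: hypothetical map from a pattern (as an XPath string) to the
                ids of the original queries it is assumed to represent.
    cost:       hypothetical positive resource cost of keeping each pattern.
    budget:     total resource available for the synopsis.
    """
    chosen: List[str] = []
    covered: Set[int] = set()
    remaining = budget
    pool = dict(candidates)

    while pool:
        best = None
        best_score = 0.0
        # The benefit is "dynamic": it is recomputed every round, because
        # queries already covered by chosen patterns no longer count.
        for pattern in sorted(pool):  # sorted for deterministic tie-breaking
            if cost[pattern] > remaining:
                continue
            gain = len(pool[pattern] - covered)
            score = gain / cost[pattern]
            if score > best_score:
                best, best_score = pattern, score
        if best is None:  # nothing affordable adds coverage
            break
        chosen.append(best)
        covered |= pool.pop(best)
        remaining -= cost[best]

    return chosen
```

In this sketch a pattern that looked valuable at the start can lose its benefit once an overlapping pattern has been selected, which is the intuition behind recomputing scores in each round. The paper's actual benefit model and its recall, precision, and intersection criteria are not specified in the abstract and are not modeled here.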