Exploit Sequencing to Accelerate Hot XML Query Pattern Mining

Jianhua Feng, Qian, Jianyong Wang, Lizhu Zhou
DOI: https://doi.org/10.1145/1141277.1141400
2006-01-01
Abstract: Speeding up query evaluation in large XML repositories has become a challenging and important problem as XML-related applications proliferate. Once hot XML query patterns are discovered, indexing and caching can be applied effectively to improve query performance. Previous algorithms for finding hot query patterns essentially used a straightforward generate-and-test strategy. In this paper, we present SOLARIA, an efficient algorithm for mining hot XML query patterns without candidate maintenance or costly tree-containment checking. It applies efficient sequence mining to discover frequent tree-structured patterns, replacing expensive containment testing with cheap parent-child checking in sequences. SOLARIA aggressively prunes the unrelated search space during frequent pattern enumeration by using a parent-child relationship constraint. Motivated by indexing and caching in XML query optimization, we also propose the derived algorithm SOLARIA for mining hot "closed" XML query patterns, which provide compact and complete structural information. Through a thorough experimental study on various real-life data, we demonstrate the efficiency and scalability of SOLARIA over the previously known alternative. SOLARIA also scales linearly with the size of the XML queries.
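The idea of replacing tree-containment testing with parent-child checking in sequences can be illustrated with a small sketch. This is not the SOLARIA algorithm itself, whose details the abstract does not give. It only assumes a hypothetical encoding in which each query tree is flattened into a preorder sequence of (label, parent position) pairs, so that a parent-child test becomes a constant-time lookup. The names `to_sequence`, `is_parent_child`, and `frequent_edges` are invented for illustration.

```python
from collections import Counter

def to_sequence(tree, parent=-1, seq=None):
    """Flatten a nested (label, [children]) tree into a preorder list of
    (label, parent_position) pairs. The root has parent position -1."""
    if seq is None:
        seq = []
    label, children = tree
    pos = len(seq)
    seq.append((label, parent))
    for child in children:
        to_sequence(child, pos, seq)
    return seq

def is_parent_child(seq, i, j):
    """Constant-time check: is the node at position i the parent of the node at j?"""
    return seq[j][1] == i

def frequent_edges(queries, min_support):
    """Count, per query, the distinct (parent_label, child_label) edges and
    keep those appearing in at least min_support queries."""
    counts = Counter()
    for tree in queries:
        seq = to_sequence(tree)
        edges = {(seq[p][0], label) for label, p in seq if p >= 0}
        counts.update(edges)
    return {edge: n for edge, n in counts.items() if n >= min_support}

# Two toy query patterns (hypothetical data).
q1 = ("book", [("title", []), ("author", [("name", [])])])
q2 = ("book", [("title", [])])
seq1 = to_sequence(q1)
hot = frequent_edges([q1, q2], min_support=2)
```

Single edges are the smallest tree patterns. A full miner would grow larger patterns from them, but the sketch shows why a sequence with parent positions makes the structural check cheap.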