Abstract: Existing algorithms for mining frequent XML query patterns (XQPs) employ a candidate generate-and-test strategy, which involves expensive candidate enumeration and costly tree-containment checking. Furthermore, most existing methods periodically recompute the frequencies of candidate query patterns from scratch by scanning the entire transaction database, which consists of XQPs derived from user query logs. Maintaining the discovered frequent patterns in real XML databases is not straightforward, however, because frequent updates may both invalidate some existing frequent query patterns and produce new ones. Existing methods are therefore inefficient when the transaction database evolves. To address these problems, this paper proposes ESPRIT, an efficient algorithm that mines frequent XQPs without costly tree-containment checking. ESPRIT transforms XML queries into sequences using a one-to-one mapping technique and mines the frequent sequences to generate frequent XQPs. We also propose two efficient incremental algorithms, ESPRIT-i and ESPRIT-i (+), to incrementally mine frequent XQPs. In addition, we devise several novel optimization techniques for query rewriting, cache lookup, and cache replacement to improve the answerability and the hit rate of caching. We have implemented our algorithms and conducted a set of experimental studies on various datasets. The experimental results demonstrate that our algorithms achieve high efficiency and scalability and outperform state-of-the-art methods significantly.
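
To make the idea of a one-to-one mapping between query patterns and sequences concrete, the following is a minimal sketch and not the encoding used by ESPRIT, which the abstract does not specify. It assumes a query pattern can be modelled as an ordered, labelled tree, ignores wildcards and descendant edges, and uses a preorder list of (depth, label) pairs as the sequence. The names `encode`, `decode`, and `frequent_sequences` are hypothetical, and the counting step only tallies whole patterns that recur in a log; it does not mine sub-patterns.

```python
from collections import Counter

# Assumption: a query pattern is a nested tuple (label, [children...]),
# treated as an ordered tree. This is an illustrative model only.

def encode(tree, depth=0):
    """Preorder list of (depth, label) pairs for an ordered labelled tree."""
    label, children = tree
    seq = [(depth, label)]
    for child in children:
        seq.extend(encode(child, depth + 1))
    return seq

def decode(seq):
    """Rebuild the tree from its (depth, label) sequence."""
    root = None
    stack = []  # stack[d] is the most recently opened node at depth d
    for depth, label in seq:
        node = (label, [])
        if depth == 0:
            root = node
        else:
            stack[depth - 1][1].append(node)  # attach to its parent
        del stack[depth:]                     # close deeper or sibling nodes
        stack.append(node)
    return root

def frequent_sequences(patterns, min_support):
    """Count identical encoded patterns and keep those meeting min_support."""
    counts = Counter(tuple(encode(p)) for p in patterns)
    return {seq: n for seq, n in counts.items() if n >= min_support}

q1 = ("book", [("title", []), ("author", [("name", [])])])
q2 = ("book", [("price", [])])
log = [q1, q2, q1]

print(encode(q1))
print(frequent_sequences(log, min_support=2))
```

Because the depth of each node is stored alongside its label, decoding recovers exactly the tree that was encoded, so two patterns are equal precisely when their sequences are equal. That property is what lets equality of sequences stand in for a tree comparison in this simplified setting.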
