Complex event recognition meets hierarchical conjunctive queries

Dante Pinto, Cristian Riveros
2024-08-03
Abstract: Hierarchical conjunctive queries (HCQ) are a subclass of conjunctive queries (CQ) with robust algorithmic properties. Among others, Berkholz, Keppeler, and Schweikardt have shown that HCQ is the subclass of CQ (without projection) that admits dynamic query evaluation with constant update time and constant delay enumeration. In a different but related setting stands Complex Event Recognition (CER), a prominent technology for evaluating sequence patterns over streams. Since one can interpret a data stream as an unbounded sequence of inserts in dynamic query evaluation, it is natural to ask to what extent CER can take advantage of HCQ to find a robust class of queries that can be evaluated efficiently. In this paper, we seek to combine HCQ with sequence patterns to find a class of CER queries that gets the best of both worlds. To reach this goal, we propose a complex event automata model called Parallelized Complex Event Automata (PCEA) for evaluating CER queries with correlation (i.e., joins) over streams. This model allows us to express sequence patterns and compare values among tuples, but it also allows us to express conjunctions by incorporating a novel form of non-determinism that we call parallelization. We show that for every HCQ (under bag semantics), we can construct an equivalent PCEA. Further, we show that HCQ is the biggest class of acyclic CQ that this automata model can define. Thus, PCEA stands as a sweet spot that precisely expresses HCQ (i.e., among acyclic CQ) and extends them with sequence patterns. Finally, we show that PCEA also inherits the good algorithmic properties of HCQ by presenting a streaming evaluation algorithm under sliding windows with logarithmic update time and output-linear delay for the class of PCEA with equality predicates.
Databases, Data Structures and Algorithms, Formal Languages and Automata Theory
What problem does this paper attempt to address?
### Problems the paper attempts to solve

This paper aims to combine **Hierarchical Conjunctive Queries (HCQ)** with **Complex Event Recognition (CER)** to find a class of CER queries that can be evaluated efficiently. Specifically, the authors want to carry the good algorithmic properties of HCQ over to CER queries, so that these queries can handle sequence patterns while remaining efficient to evaluate.

#### Main problem description

1. **Background and motivation**:
   - **Hierarchical Conjunctive Queries (HCQ)**: a subclass of conjunctive queries (CQ) with good dynamic query evaluation properties. Berkholz, Keppeler, and Schweikardt showed that, among CQ without projection, HCQ is the subclass that admits dynamic query evaluation with constant update time and constant-delay enumeration.
   - **Complex Event Recognition (CER)**: a technology for evaluating sequence patterns over data streams. Since a data stream can be interpreted as an unbounded sequence of inserts in dynamic query evaluation, it is natural to ask to what extent CER can benefit from HCQ.
2. **Research objectives**:
   - Find a class of CER queries that combines HCQ with sequence patterns, so that these queries keep the efficiency of HCQ while expressing sequence patterns.
   - To this end, the authors propose a new automata model called **Parallelized Complex Event Automata (PCEA)**, which evaluates CER queries with correlation (i.e., joins) over streams.
3. **Key challenges**:
   - How to extend HCQ with sequence patterns while maintaining efficiency.
   - How to design an automata model (here, PCEA) that can express HCQ and also handle CER queries.
4. **Solutions**:
   - Propose the PCEA model, which introduces a novel form of non-determinism called parallelization: multiple independent runs proceed in parallel and are merged when a new data item is read. This is what lets the model express conjunctions.
   - Prove that for every HCQ (under bag semantics) there is an equivalent PCEA, and that HCQ is the largest class of acyclic CQ that this automata model can define.
   - Show that PCEA inherits the good algorithmic properties of HCQ by giving a streaming evaluation algorithm under sliding windows with logarithmic update time and output-linear delay for the class of PCEA with equality predicates.

### Summary

The core problem of the paper is how to combine the strengths of HCQ and CER to obtain a class of CER queries that is both expressive and efficient to evaluate. The authors address it with Parallelized Complex Event Automata (PCEA), which precisely express HCQ among acyclic CQ, extend them with sequence patterns, and admit efficient streaming evaluation under sliding windows.
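
### Illustration: checking whether a CQ is hierarchical

The summary relies on the notion of a hierarchical CQ without defining it. The sketch below uses the standard definition for CQ without projection: for every pair of variables, the sets of atoms in which they occur must be either nested or disjoint. The query encoding (a list of `(relation, variables)` pairs) and the names `atoms_of` and `is_hierarchical` are assumptions made for this illustration and do not come from the paper.

```python
from itertools import combinations


def atoms_of(query):
    """Map each variable to the set of atom indices in which it occurs."""
    occurrences = {}
    for index, (_relation, variables) in enumerate(query):
        for var in variables:
            occurrences.setdefault(var, set()).add(index)
    return occurrences


def is_hierarchical(query):
    """Return True if, for every pair of variables, their atom sets are
    nested (one contains the other) or disjoint."""
    occurrences = atoms_of(query)
    for x, y in combinations(occurrences, 2):
        ax, ay = occurrences[x], occurrences[y]
        if not (ax <= ay or ay <= ax or ax.isdisjoint(ay)):
            return False
    return True


# Hypothetical example queries (illustration only).
# Q1(x, y, z) <- R(x, y), S(x, z), T(x): x occurs in every atom.
q1 = [("R", ("x", "y")), ("S", ("x", "z")), ("T", ("x",))]
# Q2(x, y) <- R(x), S(x, y), T(y): x and y overlap only in S.
q2 = [("R", ("x",)), ("S", ("x", "y")), ("T", ("y",))]

print(is_hierarchical(q1))
print(is_hierarchical(q2))
```

In `q1`, the atom sets of `y` and `z` are disjoint and both are contained in the atom set of `x`, so the check succeeds. In `q2`, `x` and `y` share the atom `S` but neither atom set contains the other, which is the usual example of a non-hierarchical query.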