Extremal Separation Problems for Temporal Instance Queries

Jean Christoph Jung, Vladislav Ryzhikov, Frank Wolter, Michael Zakharyaschev
2024-06-07
Abstract: The separation problem for a class Q of database queries is to find a query in Q that distinguishes between a given set of 'positive' and 'negative' data examples. Separation provides explanations of examples and underpins the query-by-example paradigm to support database users in constructing and refining queries. As the space of all separating queries can be large, it is helpful to succinctly represent this space by means of its most specific (logically strongest) and general (weakest) members. We investigate this extremal separation problem for classes of instance queries formulated in linear temporal logic LTL with the operators conjunction, next, and eventually. Our results range from tight complexity bounds for verifying and counting extremal separators to algorithms computing them.
Databases, Logic in Computer Science
What problem does this paper attempt to address?
The paper addresses extremal separation problems for temporal instance queries. Given sets of positive and negative data examples, the task is to find queries that distinguish the positive examples from the negative ones. This supports database users in constructing and refining queries from examples. The paper studies classes of temporal instance queries formulated in linear temporal logic (LTL) with the operators "conjunction", "next", and "eventually".

### Specific Description of the Problem

The separation problem asks for a query \(q \in Q\) in a query class \(Q\) such that, for a given pair \(E = (E^{+}, E^{-})\) of sets of positive and negative data examples, every positive example \(D \in E^{+}\) satisfies \(D \models q\) and every negative example \(D \in E^{-}\) satisfies \(D \not\models q\). Such separating queries explain the examples and support the query-by-example paradigm, helping users construct and refine queries.

### Research Content

The paper studies extremal separation in classes of LTL queries, specifically temporal instance queries built with the "conjunction", "next", and "eventually" operators. It covers:

1. **Verifying and Counting the Most Specific and the Most General Separating Queries**: The paper establishes the complexity of verifying and counting these extremal separating queries.
2. **Computing the Most Specific and the Most General Separating Queries**: The paper proposes algorithms that compute these extremal separating queries.
3. **Complexity of Logical Implication**: It studies logical implication between queries and its complexity.
4. **Weakening and Strengthening Boundaries**: It explores how weakening and strengthening boundaries help describe the space of separating queries.
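The separation condition defined above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration rather than the paper's own formalization: a data instance is modelled as a finite list of atom sets (one per time point), queries are evaluated at time 0, "eventually" is read as strictly in the future, and the names `Atom`, `And`, `Next`, `Eventually`, `holds`, and `separates` are hypothetical.

```python
from dataclasses import dataclass
from typing import AbstractSet, Iterable, Sequence, Union

# Assumption: a data instance is a finite trace; position i lists the atoms true at time i.
Instance = Sequence[AbstractSet[str]]


@dataclass(frozen=True)
class Atom:
    name: str


@dataclass(frozen=True)
class And:
    left: "Query"
    right: "Query"


@dataclass(frozen=True)
class Next:
    body: "Query"


@dataclass(frozen=True)
class Eventually:
    body: "Query"


Query = Union[Atom, And, Next, Eventually]


def holds(d: Instance, q: Query, t: int = 0) -> bool:
    """Check whether query q is true in instance d at time point t."""
    if isinstance(q, Atom):
        return t < len(d) and q.name in d[t]
    if isinstance(q, And):
        return holds(d, q.left, t) and holds(d, q.right, t)
    if isinstance(q, Next):
        return holds(d, q.body, t + 1)
    if isinstance(q, Eventually):
        # Assumption: strict future. Nothing holds past the end of the trace,
        # so it suffices to scan the remaining positions of d.
        return any(holds(d, q.body, s) for s in range(t + 1, len(d)))
    raise TypeError(f"unsupported query: {q!r}")


def separates(q: Query, positives: Iterable[Instance], negatives: Iterable[Instance]) -> bool:
    """q separates (E+, E-) if it holds in every positive and in no negative example."""
    return all(holds(d, q) for d in positives) and not any(holds(d, q) for d in negatives)


# Hypothetical examples: A holds now and B holds at some later point.
positives = [[{"A"}, {"B"}], [{"A"}, set(), {"B"}]]
negatives = [[{"B"}, {"A"}]]

a_then_later_b = And(Atom("A"), Eventually(Atom("B")))
a_then_next_b = And(Atom("A"), Next(Atom("B")))
```

With these examples, `a_then_later_b` separates the positive from the negative instances, while the stronger query `a_then_next_b` does not, because it fails on the second positive example.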
### Main Contributions

- **Complexity Results**: The paper provides tight complexity bounds for verifying and counting the most specific and the most general separating queries.
- **Algorithm Design**: It proposes algorithms for computing extremal separating queries within polynomial time.
- **Theoretical Analysis**: Combining logical and automata-based methods with pattern-matching techniques, it analyzes the space of separating queries.

### Application Background

These problems are relevant to constructing and refining database queries, and the results can also be applied in areas such as automatic feature extraction and classifier engineering. In addition, separating queries provide a basis for explaining the positive and negative data examples supplied by applications.

In summary, the paper systematically studies extremal separation problems for temporal instance queries, providing theoretical analysis and algorithms that support more effective query construction and refinement.