Georg Gottlob, Christoph Koch, Klaus U. Schulz
Abstract: We study the complexity and expressive power of conjunctive queries over unranked labeled trees represented using a variety of structure relations such as "child", "descendant", and "following" as well as unary relations for node labels. We establish a framework for characterizing structures representing trees for which conjunctive queries can be evaluated efficiently. Then we completely chart the tractability frontier of the problem and establish a dichotomy theorem for our axis relations, i.e., we find all subset-maximal sets of axes for which query evaluation is in polynomial time and show that for all other cases, query evaluation is NP-complete. All polynomial-time results are obtained immediately using the proof techniques from our framework. Finally, we study the expressiveness of conjunctive queries over trees and show that for each conjunctive query, there is an equivalent acyclic positive query (i.e., a set of acyclic conjunctive queries), but that in general this query is not of polynomial size.
Databases, Artificial Intelligence, Computational Complexity, Logic in Computer Science
What problem does this paper attempt to address?
The paper studies the complexity and expressive power of conjunctive queries over trees, specifically unranked labeled trees. It focuses on the following aspects:
1. **Complexity issues**: Examine how the choice of structure relations (axes such as "child", "descendant", and "following") affects the complexity of evaluating conjunctive queries over trees. The authors identify all subset-maximal sets of axes for which query evaluation is in polynomial time and prove that in all other cases query evaluation is NP-complete.
2. **Expressive power issues**: Study the expressive power of conjunctive queries over trees and prove that every conjunctive query has an equivalent acyclic positive query (i.e., a set of acyclic conjunctive queries), although in general this query is not of polynomial size.
3. **Tractability frontier**: Chart the boundary between tractable and intractable cases completely by establishing a dichotomy theorem for the axis relations. All polynomial-time results follow from the proof techniques of the framework the authors set up for characterizing tree structures on which conjunctive queries can be evaluated efficiently.
4. **Succinctness issues**: Examine the succinctness of (cyclic) conjunctive queries and prove that there are conjunctive queries whose equivalent acyclic positive queries are not of polynomial size, so a size blow-up in the translation is unavoidable.
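To make the setting concrete, the following is a minimal sketch of a conjunctive query over a small labeled tree. The tree, the helper names (`child`, `descendant`, `following`, `evaluate`), and the sample query are all hypothetical illustrations and do not come from the paper. The evaluator simply tries every assignment of nodes to variables, so it takes time exponential in the number of variables; it is not the paper's polynomial-time technique, only a baseline that shows what "evaluating a conjunctive query over axis relations" means.

```python
from itertools import product

# Hypothetical toy tree: node id -> (label, children in document order).
TREE = {
    0: ("a", [1, 4]),
    1: ("b", [2, 3]),
    2: ("c", []),
    3: ("d", []),
    4: ("b", [5]),
    5: ("c", []),
}

def preorder(tree, root=0):
    """Return the nodes in document (pre-)order, assuming `root` is the root."""
    order, stack = [], [root]
    while stack:
        n = stack.pop()
        order.append(n)
        stack.extend(reversed(tree[n][1]))
    return order

def child(tree):
    return {(p, c) for p, (_, kids) in tree.items() for c in kids}

def descendant(tree):
    """Proper descendants: the transitive closure of the child relation."""
    rel = set()
    def visit(anc, node):
        for c in tree[node][1]:
            rel.add((anc, c))
            visit(anc, c)
    for n in tree:
        visit(n, n)
    return rel

def following(tree, root=0):
    """y follows x if y comes later in document order and is not a descendant of x."""
    pos = {n: i for i, n in enumerate(preorder(tree, root))}
    desc = descendant(tree)
    return {(x, y) for x in tree for y in tree
            if pos[x] < pos[y] and (x, y) not in desc}

def evaluate(tree, variables, label_atoms, axis_atoms, axes):
    """Naive evaluation: test every assignment of nodes to variables."""
    answers = []
    for values in product(list(tree), repeat=len(variables)):
        env = dict(zip(variables, values))
        labels_ok = all(tree[env[v]][0] == lab for lab, v in label_atoms)
        axes_ok = all((env[x], env[y]) in axes[ax] for ax, x, y in axis_atoms)
        if labels_ok and axes_ok:
            answers.append(env)
    return answers

AXES = {"child": child(TREE), "descendant": descendant(TREE),
        "following": following(TREE)}

# A cyclic query: a(x), c(y), c(z), descendant(x, y), descendant(x, z), following(y, z).
answers = evaluate(
    TREE,
    ["x", "y", "z"],
    [("a", "x"), ("c", "y"), ("c", "z")],
    [("descendant", "x", "y"), ("descendant", "x", "z"), ("following", "y", "z")],
    AXES,
)
print(answers)
```

The three binary atoms form a triangle on the variables, which is what makes this query cyclic; the expressive power result above says such a query can always be replaced by a set of acyclic ones, at a possible cost in size.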
### Main contributions
- **Application of the X-property**: Introduce the X-property, determine which axis relations have it, and use it to identify the classes of queries that can be evaluated in polynomial time.
- **Complexity classification**: Provide a complexity classification table (see Table I) for conjunctive queries over relational structures containing one or two axes, marking the boundary between polynomial time and NP-completeness.
- **Expressive power results**: Prove that for every conjunctive query there exists an equivalent acyclic positive query, although this query may not be of polynomial size.
- **Succinctness results**: Show that certain conjunctive queries have no equivalent acyclic positive query of polynomial size, so the size blow-up in the translation cannot be avoided.
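The summary above does not define the X-property, so the sketch below rests on an assumption: it uses the formulation that a binary relation R has the X-property with respect to a total order < if, whenever n0 < n1, n2 < n3, R(n1, n2), and R(n0, n3) hold, R(n0, n2) holds as well. The toy tree and all function names are hypothetical, and the check is a brute-force test on one small example, not a proof about axes in general.

```python
# Hypothetical toy tree given as child lists; node ids coincide with pre-order.
CHILDREN = {0: [1, 4], 1: [2, 3], 2: [], 3: [], 4: [5], 5: []}
PREORDER = [0, 1, 2, 3, 4, 5]

def transitive_closure(rel):
    """Smallest transitive relation containing `rel`."""
    result = set(rel)
    while True:
        extra = {(a, d) for (a, b) in result for (c, d) in rel if b == c} - result
        if not extra:
            return result
        result |= extra

def has_x_property(rel, order):
    """Brute-force check of the X-property (in the formulation assumed above)."""
    pos = {n: i for i, n in enumerate(order)}
    for (n1, n2) in rel:
        for (n0, n3) in rel:
            crossing = pos[n0] < pos[n1] and pos[n2] < pos[n3]
            if crossing and (n0, n2) not in rel:
                return False
    return True

child_rel = {(p, c) for p, kids in CHILDREN.items() for c in kids}
descendant_rel = transitive_closure(child_rel)

# On this tree, "descendant" passes the check under pre-order, "child" does not.
print(has_x_property(descendant_rel, PREORDER))
print(has_x_property(child_rel, PREORDER))
```

Whether a relation passes depends on the order chosen, which is why the check takes the order as an explicit argument: here the pairs (1, 2) and (0, 4) are in the child relation, but (0, 2) is not, so "child" fails under pre-order on this tree.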
Together, these results give a comprehensive theoretical account of conjunctive queries over trees. They clarify the complexity and expressive power of such queries and can also inform query optimization in practice.