A Novel Technique for Query Plan Representation Based on Graph Neural Nets

Baoming Chang, Amin Kamali, Verena Kantere
2024-06-05
Abstract: Learning representations for query plans plays a pivotal role in machine learning-based query optimizers of database management systems. To this end, particular model architectures are proposed in the literature to transform the tree-structured query plans into representations with formats learnable by downstream machine learning models. However, existing research rarely compares and analyzes the query plan representation capabilities of these tree models and their direct impact on the performance of the overall optimizer. To address this problem, we perform a comparative study to explore the effect of using different state-of-the-art tree models on the optimizer's cost estimation and plan selection performance in relatively complex workloads. Additionally, we explore the possibility of using graph neural networks (GNNs) in the query plan representation task. We propose a novel tree model BiGG employing Bidirectional GNN aggregated by Gated recurrent units (GRUs) and demonstrate experimentally that BiGG provides significant improvements to cost estimation tasks and relatively excellent plan selection performance compared to the state-of-the-art tree models.
Databases, Artificial Intelligence
What problem does this paper attempt to address?
### Problems Addressed by the Paper

The paper focuses on query optimization in database management systems, and in particular on how to represent query plans effectively so that cost estimation and plan selection improve. It addresses three points:

1. **Comparative Study of Query Plan Representation Capabilities**
   - Existing research rarely compares and analyzes how well different tree models represent query plans, or how that choice affects the performance of the overall optimizer. The paper runs a comparative study on relatively complex workloads, measuring different state-of-the-art tree models on cost estimation and plan selection tasks.
2. **Exploration of Graph Neural Network (GNN) Applications**
   - Although GNNs have been successful in other graph-structured domains, their use for query plan representation remains underexplored. The paper explores this possibility and proposes BiGG, a new tree model that employs a bidirectional GNN aggregated by gated recurrent units (GRUs). Experimentally, BiGG significantly improves cost estimation and achieves relatively strong plan selection performance compared to the state-of-the-art tree models.
3. **Key Challenges in Tree Model Design**
   - Existing tree models face two main challenges when handling query plans: information dilution (information is lost as it is passed from the leaf nodes to the root node) and structural information retention (structure must be preserved while information from the entire graph is aggregated). The proposed model is designed to address both.

In summary, the paper aims to improve the accuracy of cost estimation and the quality of plan selection in query optimizers by comparing existing tree models and introducing a new GNN-based model.
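
The general idea of a bidirectional GNN with GRU aggregation can be illustrated with a small sketch. This is not the authors' implementation: the summary above gives no equations, so the update rules, the mean aggregation, the toy three-node plan, and every name here (`message_pass`, `gru_step`, `embed_plan`) are assumptions made for illustration, and the weights are random rather than trained. It shows messages flowing both child-to-parent and parent-to-child, followed by a GRU that reads the node states in sequence instead of relying on the root alone.

```python
import math
import random

def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def add(a, b):
    return [x + y for x, y in zip(a, b)]

def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-x)) for x in v]

def tanh(v):
    return [math.tanh(x) for x in v]

def rand_matrix(rng, rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def message_pass(h, edges, W_self, W_nbr):
    """One round: each node mixes its own state with the mean of the
    states arriving along directed edges (src -> dst)."""
    incoming = {n: [] for n in h}
    for src, dst in edges:
        incoming[dst].append(h[src])
    out = {}
    for n, state in h.items():
        msgs = incoming[n]
        mean = [sum(col) / len(msgs) for col in zip(*msgs)] if msgs else [0.0] * len(state)
        out[n] = tanh(add(matvec(W_self, state), matvec(W_nbr, mean)))
    return out

def gru_step(h, x, P):
    """Standard GRU cell update (one common gate convention)."""
    z = sigmoid(add(matvec(P["Wz"], x), matvec(P["Uz"], h)))  # update gate
    r = sigmoid(add(matvec(P["Wr"], x), matvec(P["Ur"], h)))  # reset gate
    rh = [ri * hi for ri, hi in zip(r, h)]
    c = tanh(add(matvec(P["Wh"], x), matvec(P["Uh"], rh)))    # candidate state
    return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, c)]

def embed_plan(features, child_edges, order, dim=4, rounds=2, seed=0):
    """features: node id -> operator feature vector.
    child_edges: (child, parent) pairs. order: node visiting order for the GRU."""
    rng = random.Random(seed)  # untrained, random weights: illustration only
    in_dim = len(next(iter(features.values())))
    W_in = rand_matrix(rng, dim, in_dim)
    W = {k: rand_matrix(rng, dim, dim)
         for k in ("up_self", "up_nbr", "down_self", "down_nbr")}
    P = {k: rand_matrix(rng, dim, 2 * dim if k.startswith("W") else dim)
         for k in ("Wz", "Uz", "Wr", "Ur", "Wh", "Uh")}
    h0 = {n: tanh(matvec(W_in, f)) for n, f in features.items()}
    up, down = dict(h0), dict(h0)
    parent_edges = [(p, c) for c, p in child_edges]  # reversed direction
    for _ in range(rounds):
        up = message_pass(up, child_edges, W["up_self"], W["up_nbr"])
        down = message_pass(down, parent_edges, W["down_self"], W["down_nbr"])
    state = [0.0] * dim
    for n in order:
        # Concatenate both directions so every node contributes to the readout.
        state = gru_step(state, up[n] + down[n], P)
    return state

# Hypothetical plan: node 0 is a join over two scans (nodes 1 and 2),
# with one-hot operator types as features.
features = {0: [1.0, 0.0, 0.0], 1: [0.0, 1.0, 0.0], 2: [0.0, 0.0, 1.0]}
child_edges = [(1, 0), (2, 0)]
vec = embed_plan(features, child_edges, order=[1, 2, 0])
print(len(vec))  # prints 4
```

Reading all node states through the GRU, rather than taking only the root's state, is one plausible way to counter the information dilution described above; in a trained model the resulting vector would feed a downstream cost or plan-ranking head.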