Budget-aware Query Tuning: An AutoML Perspective

Wentao Wu,Chi Wang
2024-03-30
Abstract:Modern database systems rely on cost-based query optimizers to come up with good execution plans for input queries. Such query optimizers rely on cost models to estimate the costs of candidate query execution plans. A cost model represents a function from a set of cost units to query execution cost, where each cost unit specifies the unit cost of executing a certain type of query processing operation (such as table scan or join). These cost units are traditionally viewed as constants, whose values only depend on the platform configuration where the database system runs on top of but are invariant for queries processed by the database system. In this paper, we challenge this classic view by thinking of these cost units as variables instead. We show that, by varying the cost-unit values one can obtain query plans that significantly outperform the default query plans returned by the query optimizer when viewing the cost units as constants. We term this cost-unit tuning process "query tuning" (QT) and show that it is similar to the well-known hyper-parameter optimization (HPO) problem in AutoML. As a result, any state-of-the-art HPO technologies can be applied to QT. We study the QT problem in the context of anytime tuning, which is desirable in practice by constraining the total time spent on QT within a given budget -- we call this problem budget-aware query tuning. We further extend our study from tuning a single query to tuning a workload with multiple queries, and we call this generalized problem budget-aware workload tuning (WT), which aims for minimizing the execution time of the entire workload. WT is more challenging as one needs to further prioritize individual query tuning within the given time budget. We propose solutions to both QT and WT and experimental evaluation using both benchmark and real workloads demonstrates the efficacy of our proposed solutions.
Databases,Artificial Intelligence,Machine Learning
What problem does this paper attempt to address?
This paper attempts to address the problem of how to generate more efficient query execution plans in modern database systems by adjusting the cost unit values of the query optimizer. Specifically, the paper challenges the traditional view of treating cost units as constants and proposes treating these cost units as variables that can be adjusted for each query. Through this adjustment, query performance can be significantly improved. ### Main Research Questions: 1. **Budget-aware Query Tuning (QT)**: Finding the optimal cost unit values within a given time budget to minimize the execution time of a single query. 2. **Budget-aware Workload Tuning (WT)**: Finding the optimal cost unit values within a given time budget to minimize the execution time of the entire workload. ### Research Background: - **Cost Model**: The query optimizer of modern database systems relies on a cost model to estimate the cost of candidate query execution plans. The cost model is a function from a set of cost units to the query execution cost, with each cost unit specifying the unit cost of a certain query processing operation (such as table scan or join). - **Traditional View**: Cost units are typically viewed as constants, with their values depending only on the platform configuration on which the database system runs, but remaining unchanged for the queries being processed. - **New View**: The paper proposes treating cost units as variables and optimizing query execution plans by adjusting these variables. ### Research Methods: - **Query Tuning (QT)**: Similar to the hyperparameter optimization (HPO) problem in automated machine learning (AutoML), existing HPO techniques can be applied to solve the QT problem. - **Workload Tuning (WT)**: More complex, requiring prioritization of multiple queries within the total time budget. The paper proposes four strategies: Round Robin, Cost-based Prioritization, Multi-armed Bandit, and Improvement Rate. ### Experimental Results: - **Query Tuning**: Experimental results show that query performance can be significantly improved in a short time through query tuning. For example, on the JOB workload, query tuning can reduce execution time by 64.2%. - **Workload Tuning**: The paper compares different workload tuning algorithms and finds that Round Robin and Multi-armed Bandit strategies perform best in most cases, achieving high performance improvements in a short time. ### Conclusion: - The paper introduces the problems of budget-aware query tuning and workload tuning and demonstrates the effectiveness of these methods. - Future research directions include further leveraging the similarity between workload queries to improve existing tuning strategies. ### Related Work: - **Autonomous Tuning**: The tuning problem of database systems has attracted extensive research interest, including autonomous knob tuning and index tuning. - **Hyperparameter Optimization**: The HPO problem has been widely studied in the field of automated machine learning, and many existing HPO techniques can be applied to query tuning. In summary, this paper provides a new perspective and method for improving database query performance by introducing dynamic adjustment of cost units.