Learned Optimizer for Online Approximate Query Processing in Data Exploration

Liyuan Liu, Hanbing Zhang, Yinan Jing, Zhenying He, Kai Zhang, X. Sean Wang
DOI: https://doi.org/10.1109/tkde.2024.3361989
IF: 9.235
2024-01-01
IEEE Transactions on Knowledge and Data Engineering
Abstract: In interactive data exploration, approximate query processing (AQP) can be used to return query results quickly at the cost of accuracy. For online AQP, the sampler can be treated as an operator in the query plan. During query optimization for AQP, heuristic rules are usually used to guide the sampler push-down. However, because data distributions are complex and change over time, heuristic rule-based optimization methods cannot always meet users' query accuracy requirements. In this paper, we propose a learning-based query optimization method for online AQP. We first introduce the concept of weak equivalence and propose a series of push-down rules to guide the sampler push-down during query optimization. Then, to enable more queries to meet users' accuracy requirements, we propose a deep learning model to further optimize the query plan. By applying this model at each push-down step of the sampler, we try to avoid the negative effect of inappropriate sampler push-down on query accuracy, especially when the underlying and intermediate data distributions are inconsistent. Extensive experiments show that the proposed method can outperform the state-of-the-art online sampling-based AQP method by 1.2× to 7.9× in query accuracy.
computer science, information systems, artificial intelligence, engineering, electrical & electronic
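
To make the abstract's idea of treating the sampler as a plan operator concrete, the following is a minimal sketch, not the authors' method. It assumes a toy schema (orders joined to customers on a foreign key), a Bernoulli sampler, and a SUM aggregate scaled by 1/rate. All function and table names are hypothetical. In this benign case, sampling the fact table before the join yields the same estimate as sampling the join output, while joining fewer rows. The paper's push-down rules and learned model target the harder cases where such a move would hurt accuracy.

```python
import random


def make_tables(seed=0):
    """Build a toy fact table (orders) and dimension table (customers)."""
    rng = random.Random(seed)
    # customers: customer_id -> region (hypothetical schema)
    customers = {cid: ("east" if cid % 2 == 0 else "west") for cid in range(100)}
    # orders: (customer_id, amount); every order references an existing customer
    orders = [(rng.randrange(100), rng.uniform(1, 100)) for _ in range(10000)]
    return orders, customers


def join(orders, customers):
    """Foreign-key join: each order matches exactly one customer."""
    return [(cid, amount, customers[cid]) for cid, amount in orders if cid in customers]


def bernoulli_sample(rows, rate, rng):
    """Sampler operator: keep each row independently with probability `rate`."""
    return [row for row in rows if rng.random() < rate]


def sum_east(joined):
    """Aggregate: total amount for customers in the 'east' region."""
    return sum(amount for _, amount, region in joined if region == "east")


def exact_plan(orders, customers):
    """Reference plan without sampling."""
    return sum_east(join(orders, customers))


def sample_after_join(orders, customers, rate, rng):
    """Plan A: the sampler sits above the join."""
    sampled = bernoulli_sample(join(orders, customers), rate, rng)
    return sum_east(sampled) / rate  # scale up by the inverse sampling rate


def sample_pushed_down(orders, customers, rate, rng):
    """Plan B: the sampler is pushed below the join onto the fact table."""
    sampled_orders = bernoulli_sample(orders, rate, rng)
    return sum_east(join(sampled_orders, customers)) / rate


if __name__ == "__main__":
    orders, customers = make_tables(seed=0)
    exact = exact_plan(orders, customers)
    late = sample_after_join(orders, customers, 0.1, random.Random(1))
    early = sample_pushed_down(orders, customers, 0.1, random.Random(1))
    print(f"exact={exact:.1f} sample-after-join={late:.1f} pushed-down={early:.1f}")
```

With the same random seed, both plans keep the same order rows, because the foreign-key join maps each order to exactly one output row. When a join changes row multiplicities, or a selective filter skews the intermediate distribution, the two placements are no longer interchangeable.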