A Sampling-based Learning Framework for Big Databases

Jingtian Zhang, Sai Wu, Junbo Zhao, Zhongle Xie, Feifei Li, Yusong Gao, Gang Chen
DOI: https://doi.org/10.1145/3485447.3511991
2022-01-01
Abstract: Next-generation autonomous databases aim to apply reinforcement learning (RL) to tasks such as query optimization and performance tuning with little or no intervention from human DBAs. Despite this promise, obtaining a decent policy model for database optimization remains challenging, primarily because of the inherent computational overhead of data-hungry RL frameworks, particularly on large databases. To mitigate this overhead, we propose Mirror. At the core of Mirror is a sampling process built into an RL framework, together with a process that transfers the policy model from the sampled database to its original counterpart. Although the idea is conceptually simple, we find that policy transfer between databases involves heavy noise and prediction drift that cannot be neglected. We therefore build a theoretically guided sampling algorithm in Mirror, assisted by a continuous fine-tuning module. Experiments on PostgreSQL and an industrial database, PolarDB, validate that Mirror effectively reduces the computational cost while maintaining satisfactory performance.