An Improving List Scheduling Algorithm Based on Reinforcement Learning and Task Duplication

Zhi Wang, Hancong Duan, Yamin Cheng
DOI: https://doi.org/10.1145/3579654.3579657
2022-01-01
Abstract: Task scheduling plays an important role in query execution because it affects both query response time and system throughput. Current database systems use simple heuristic algorithms to decide the order in which tasks are scheduled and how executors are allocated. As a result, the scheduler cannot fully exploit the characteristics of the task graph or the state information of executors to optimize scheduling dynamically. In heterogeneous environments especially, heuristic algorithms struggle to generate a good task execution sequence and to balance executor loads. To address these challenges, we design a DAG scheduler that makes scheduling decisions using a graph attention network and reinforcement learning (RL). First, the scheduler extracts features of each node in the DAG through the graph attention network and uses an LSTM module to obtain high-level representations. Then, at each step, the RL agent computes selection probabilities from these representations and selects the node with the maximum probability. Finally, the parameters of the agent are updated after all the nodes of a DAG have been scheduled. Experimental results on random DAGs and the TPC-H workload show that the proposed model outperforms existing heuristic algorithms by up to 30% in average makespan, with significant improvements on the TPC-H workload as well.
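The selection step described in the abstract (score the schedulable nodes, turn the scores into probabilities, pick the most probable node) can be illustrated with a minimal sketch. This is not the authors' implementation: the fixed `score` dictionary is a hypothetical stand-in for the learned graph attention and LSTM representations, the earliest-finish-time executor choice is an assumption made only so that a makespan can be computed, and the names `softmax`, `schedule`, `dag`, `cost`, and `speeds` are all illustrative.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def schedule(dag, cost, speeds, score):
    """Greedy scheduling loop over a DAG.

    dag:    node -> list of successor nodes
    cost:   node -> amount of work
    speeds: per-executor speed (heterogeneous executors)
    score:  node -> float, a hypothetical stand-in for the learned policy output
    """
    indegree = {n: 0 for n in dag}
    for succs in dag.values():
        for s in succs:
            indegree[s] += 1

    earliest_start = {n: 0.0 for n in dag}  # latest finish time among predecessors
    ready = sorted(n for n in dag if indegree[n] == 0)
    free_at = [0.0] * len(speeds)           # time at which each executor becomes idle
    order = []

    while ready:
        # Pick the ready node with the maximum probability.
        probs = softmax([score[n] for n in ready])
        idx = max(range(len(ready)), key=probs.__getitem__)
        node = ready.pop(idx)

        # Assumption: place the node on the executor that finishes it earliest.
        def finish_on(e):
            return max(free_at[e], earliest_start[node]) + cost[node] / speeds[e]

        best = min(range(len(speeds)), key=finish_on)
        finish = finish_on(best)
        free_at[best] = finish
        order.append((node, best, finish))

        # Release successors whose predecessors have all been scheduled.
        for s in dag[node]:
            earliest_start[s] = max(earliest_start[s], finish)
            indegree[s] -= 1
            if indegree[s] == 0:
                ready.append(s)

    makespan = max(f for _, _, f in order)
    return order, makespan


# Toy example: tasks "a" and "b" must both finish before "c" can start.
dag = {"a": ["c"], "b": ["c"], "c": []}
cost = {"a": 4, "b": 2, "c": 2}
speeds = [1.0, 2.0]                      # executor 1 is twice as fast as executor 0
score = {"a": 1.0, "b": 0.5, "c": 0.0}   # made-up scores, not learned values
order, makespan = schedule(dag, cost, speeds, score)
```

In a learned scheduler the scores would be recomputed from the current graph and executor state at every step, and the policy parameters would be trained with a reward tied to the makespan; the sketch fixes the scores up front to keep the control flow visible.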