Efficient Cloud Cluster Resource Scheduling with Deep Reinforcement Learning

Xiaochang Liu, Bei Dong, Fei Hao
DOI: https://doi.org/10.1109/cbd63341.2023.00015
2023-01-01
Abstract: Resource scheduling is a fundamental problem in many computing scenarios, and cluster resource scheduling in cloud computing has received particularly extensive attention. Traditional algorithms for cluster resource scheduling fall short in real-time performance and adaptability, and recently proposed reinforcement learning methods suffer from slow convergence. To this end, this paper proposes an efficient deep reinforcement learning (DRL) method that combines the Synchronous Advantage Actor-Critic (A2C) algorithm with the Generalized Advantage Estimator (GAE) to solve the cluster resource scheduling problem. A2C enables efficient synchronous updates, facilitating stable and rapid learning. Furthermore, GAE allows an accurate approximation of the advantage function, improving the policy gradient estimates. Experimental results show that the approach successfully trains the agent to schedule efficiently in the cluster resource scheduling task, with an improvement of up to 40% over heuristic algorithms and up to 4.5% over previously proposed deep reinforcement learning algorithms.
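To illustrate the advantage estimation the abstract refers to, here is a minimal sketch of the standard GAE backward recursion for a single trajectory. It is not the authors' implementation. The function name `gae_advantages`, the arguments `gamma` and `lam`, and their default values are assumptions chosen for illustration; the abstract does not state the hyperparameters used, and the sketch assumes the trajectory does not terminate early.

```python
def gae_advantages(rewards, values, last_value, gamma=0.99, lam=0.95):
    """Compute GAE advantages for one trajectory (hypothetical helper).

    rewards[t]  : reward received after acting at step t
    values[t]   : critic's value estimate for the state at step t
    last_value  : critic's value estimate for the state after the final step
    gamma, lam  : discount factor and GAE smoothing parameter (assumed defaults)
    """
    advantages = [0.0] * len(rewards)
    next_value = last_value
    running = 0.0
    # Walk backwards so each step can reuse the accumulated future terms.
    for t in reversed(range(len(rewards))):
        # One-step temporal-difference error at step t.
        delta = rewards[t] + gamma * next_value - values[t]
        # Exponentially weighted sum of this and all later TD errors.
        running = delta + gamma * lam * running
        advantages[t] = running
        next_value = values[t]
    return advantages


# Example: three steps with unit rewards and a critic that predicts zero.
example = gae_advantages([1.0, 1.0, 1.0], [0.0, 0.0, 0.0], 0.0, gamma=1.0, lam=1.0)
```

Setting `lam=0` reduces each advantage to the one-step TD error, while `lam=1` gives the discounted return minus the value baseline; intermediate values trade bias against variance in the policy gradient estimate.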