Energy-aware Task Scheduling Optimization with Deep Reinforcement Learning for Large-Scale Heterogeneous Systems

Li Jingbo, Zhang Xingjun, Wei Zheng, Wei Jia, Ji Zeyu
DOI: https://doi.org/10.1007/s42514-021-00083-8
2021-01-01
CCF Transactions on High Performance Computing
Abstract: The energy consumption of large-scale heterogeneous computing systems has become a critical concern on both financial and environmental fronts. Current systems employ hand-crafted heuristics and ignore changes in the system and workload characteristics. Moreover, high-dimensional state and action problems cannot be solved efficiently using traditional reinforcement learning-based methods in large-scale heterogeneous settings. Therefore, in this paper, energy-aware task scheduling with deep reinforcement learning (DRL) is proposed. First, based on the real data set SPECpower, a high-precision energy consumption model, convenient for environmental simulation, is designed. Based on the actual production conditions, a partition-based task-scheduling algorithm using proximal policy optimization on heterogeneous resources is proposed. Simultaneously, an auto-encoder is used to process high-dimensional space to speed up DRL convergence. Finally, to fully verify our algorithm, three scheduling scenarios containing large, medium, and small-scale heterogeneous environments are simulated. Experiments show that when compared with heuristics and DRL-based methods, our algorithm more effectively reduces system energy consumption and ensures the quality of service, without significantly increasing the waiting time.