Optimal Policy Characterization Enhanced Proximal Policy Optimization for Multitask Scheduling in Cloud Computing

Jiangliang Jin, Yunjian Xu
DOI: https://doi.org/10.1109/jiot.2021.3111414
IF: 10.6
2022-05-01
IEEE Internet of Things Journal
Abstract: For a serving system with multiple servers and a public queue, we study the scheduling of multiple tasks with deadlines, under random task arrivals and renewable energy generation. To minimize the weighted sum of the serving cost (associated with the energy consumption) and the delay cost (resulting from deferring the processing of tasks after their deadlines), we formulate the problem as a dynamic program with unknown transition probability. To mitigate the curse of dimensionality, we establish a partial priority rule, the earlier deadline and less demand first (ED-LDF): priority should be given to tasks with earlier deadline and less demand. In the heavy-traffic regime, the established ED-LDF characterization is proved to be optimal under arbitrary system dynamics. We propose a new, scalable ED-LDF-based proximal policy optimization (PPO) approach that integrates our (partial) optimal policy characterizations into the state-of-the-art deep reinforcement learning (DRL) algorithm. Numerical results demonstrate that the proposed ED-LDF-based PPO approach outperforms the classical PPO and three other priority rule-based PPO approaches.
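As a rough illustration of what a partial priority rule such as ED-LDF might look like in code, the sketch below treats task a as having priority over task b when a has no later deadline and no larger demand, and is strictly better in at least one of the two. This is only one reading of the rule as stated in the abstract; the `Task` fields, the function names, and the strict-dominance tie handling are assumptions made for illustration, not the paper's definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    # Hypothetical task record; field names are illustrative assumptions.
    name: str
    deadline: int  # time slot by which the task should be finished
    demand: int    # remaining processing demand


def has_priority(a: Task, b: Task) -> bool:
    """Return True if `a` precedes `b` under the assumed ED-LDF reading:
    no later deadline, no larger demand, and strictly better in one of them."""
    no_worse = a.deadline <= b.deadline and a.demand <= b.demand
    strictly_better = a.deadline < b.deadline or a.demand < b.demand
    return no_worse and strictly_better


def undominated(tasks: list[Task]) -> list[Task]:
    """Tasks that no other task has priority over (the rule's candidates)."""
    return [t for t in tasks if not any(has_priority(o, t) for o in tasks)]


tasks = [
    Task("A", deadline=3, demand=2),
    Task("B", deadline=5, demand=4),
    Task("C", deadline=2, demand=6),
    Task("D", deadline=3, demand=1),
]
candidates = undominated(tasks)
print([t.name for t in candidates])
```

Because the rule is partial, it can leave several candidates that it does not rank against each other (here, a task with an earlier deadline but larger demand versus one with a later deadline but smaller demand). Under the approach described in the abstract, the choice among such tasks would presumably be left to the learned PPO policy.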
Computer Science, Information Systems; Telecommunications; Engineering, Electrical & Electronic