Abstract: As the quantity and complexity of information processed by software systems increase, large-scale software systems increasingly require high-performance distributed computing. With the acceleration of the Internet in the Web 2.0 era, Cloud computing, a paradigm for providing dynamic, uncertain and elastic services, has shown advantages in meeting computing needs dynamically. Without an appropriate scheduling approach, however, extensive Cloud computing can incur high energy consumption and high cost, and high energy consumption in turn causes massive carbon dioxide emissions. Moreover, inappropriate scheduling reduces the service life of physical devices and increases the response time to users' requests. Hence, efficient scheduling of resources, or optimal allocation of requests, which is usually an NP-hard problem, is one of the prominent issues in Cloud computing. Focusing on improving quality of service (QoS), reducing cost and abating pollution, researchers have conducted extensive work on resource scheduling problems in Cloud computing over the years. Nevertheless, the growing complexity of Cloud computing, a super-massive distributed system, limits the applicability of scheduling approaches. Machine learning, a practical method for tackling problems in complex scenarios, has in recent years been applied as an innovative approach to resource scheduling in Cloud computing. Deep reinforcement learning (DRL), a combination of deep learning (DL) and reinforcement learning (RL), is a branch of machine learning with considerable promise for resource scheduling in Cloud computing. This paper surveys resource scheduling methods with a focus on DRL-based scheduling approaches in Cloud computing, reviews applications of DRL, and discusses challenges and future directions of DRL for scheduling in Cloud computing.

DRS: A deep reinforcement learning enhanced Kubernetes scheduler for microservice‐based system

DL2: A Deep Learning-driven Scheduler for Deep Learning Clusters

DRPC: Distributed Reinforcement Learning Approach for Scalable Resource Provisioning in Container-based Clusters

SCHED²: Scheduling Deep Learning Training Via Deep Reinforcement Learning

An Integrated Dynamic Resource Scheduling Framework in On-Demand Clouds

On a Meta Learning-based Scheduler for Deep Learning Clusters

A2C-DRL: Dynamic Scheduling for Stochastic Edge-Cloud Environments Using A2C and Deep Reinforcement Learning

An Improved Kubernetes Scheduling Algorithm for Deep Learning Platform

Smart-DRS: A Strategy of Dynamic Resource Scheduling in Cloud Data Center

RLScheduler: An Automated HPC Batch Job Scheduler Using Reinforcement Learning

Dynamic scheduling for flexible job shop using a deep reinforcement learning approach

Optimization of Task-Scheduling Strategy in Edge Kubernetes Clusters Based on Deep Reinforcement Learning

Dynamic scheduling of decentralized high-end equipment R&D projects via deep reinforcement learning

DRAS-CQSim: A Reinforcement Learning based Framework for HPC Cluster Scheduling

Deep reinforcement learning for dynamic distributed job shop scheduling problem with transfers

Deep reinforcement learning-based resource scheduling for energy optimization and load balancing in SDN-driven edge computing

A deep reinforcement learning-based optimization method for long-running applications container deployment

Fast DRL-based scheduler configuration tuning for reducing tail latency in edge-cloud jobs

Deep Reinforcement Learning for Multi-Resource Multi-Machine Job Scheduling

Deep Reinforcement Learning-based Methods for Resource Scheduling in Cloud Computing: A Review and Future Directions