Abstract: In mobile edge computing (MEC), resource scheduling is crucial to task requests' performance and service providers' cost, and involves multi-layer heterogeneous scheduling decisions. Existing schedulers typically adopt static timescales to regularly update the scheduling decisions of each layer, without adaptively adjusting the timescales of different layers, which can result in poor performance in practice. We observe that adaptive timescales would significantly improve the trade-off between operation cost and delay performance. Based on this insight, we propose EdgeTimer, the first work to automatically generate adaptive timescales to update multi-layer scheduling decisions using deep reinforcement learning (DRL). First, EdgeTimer uses a three-layer hierarchical DRL framework to decouple the multi-layer decision-making task into a hierarchy of independent sub-tasks, improving learning efficiency. Second, to cope with each sub-task, EdgeTimer adopts a safe multi-agent DRL algorithm for decentralized scheduling while ensuring system reliability. We apply EdgeTimer to a wide range of Kubernetes scheduling rules, and evaluate it using production traces with different workload patterns. Extensive trace-driven experiments demonstrate that EdgeTimer can learn adaptive timescales, irrespective of workload patterns and built-in scheduling rules. It obtains up to 9.1x more profit than existing approaches without sacrificing delay performance.

What problem does this paper attempt to address?

The paper addresses resource scheduling in mobile edge computing (MEC), specifically how to adaptively choose the timescales on which multi-layer heterogeneous scheduling decisions are updated.

### Research Background and Problem Definition

In mobile edge computing, resource scheduling is crucial for the performance of task requests and the cost to service providers. Existing schedulers typically use static timescales to periodically update the scheduling decisions at each layer, without adapting the timescale of each layer individually, which may lead to poor performance in practice.

### Solution Overview

The paper proposes EdgeTimer, which it describes as the first work to automatically generate adaptive timescales for updating multi-layer scheduling decisions using deep reinforcement learning (DRL). Specifically:

1. **Adaptive Timescales**: EdgeTimer uses DRL to learn how often the scheduling decisions at each layer should be updated, improving the trade-off between operation cost and delay performance.
2. **Hierarchical Decomposition**: A three-layer hierarchical DRL framework decomposes the multi-layer decision-making task into a hierarchy of independent sub-tasks to improve learning efficiency.
3. **Decentralized Scheduling**: To handle each sub-task, EdgeTimer employs a safe multi-agent DRL algorithm for decentralized scheduling while ensuring system reliability.

### Main Contributions

- **Adaptivity**: The timescale of each layer's scheduling decision is variable and can adapt to changes in online task request patterns.
- **Asynchrony**: Lower-layer decisions can remain unchanged when higher-layer decisions are updated.
- **Autonomy**: Each edge server can determine its own timescale based on local information, reducing cloud-to-edge communication costs.
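The asynchrony property above can be illustrated with a minimal sketch. It is not EdgeTimer's implementation: the layer names, the per-layer signals, and the threshold gate are all hypothetical, and the gate is a hand-written stand-in for the learned DRL policy. Each layer keeps a cached decision and re-runs its scheduling rule only when its own gate fires, so a higher layer can update while the lower layers stay untouched.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Layer:
    """One scheduling layer with a cached decision and its own update gate."""
    name: str                 # hypothetical layer name
    threshold: float          # hypothetical sensitivity of this layer's gate
    decision: float = 0.0     # cached decision (here: the signal it was tuned for)
    updates: int = 0          # number of times the scheduling rule was re-run

    def should_update(self, signal: float) -> bool:
        # Stand-in for a learned policy: update only when the local signal
        # has drifted far enough from the value the decision was made for.
        return abs(signal - self.decision) > self.threshold

    def step(self, signal: float) -> bool:
        if self.should_update(signal):
            self.decision = signal  # placeholder for the built-in scheduling rule
            self.updates += 1
            return True
        return False


def run(layers: List[Layer], signals: Sequence[Sequence[float]]) -> List[List[str]]:
    """For each time step, return the names of the layers that updated."""
    log = []
    for per_layer_signal in signals:
        updated = []
        for layer, signal in zip(layers, per_layer_signal):
            if layer.step(signal):
                updated.append(layer.name)
        log.append(updated)
    return log


layers = [Layer("top", 5.0), Layer("middle", 2.0), Layer("bottom", 0.5)]
signals = [
    (0.0, 0.0, 1.0),
    (6.0, 0.0, 1.2),
    (6.5, 3.0, 1.3),
    (6.5, 3.0, 1.3),
]
log = run(layers, signals)
```

In this toy run, each layer updates at a different time step and no layer updates in the last step, which is the behavior a static, shared timescale cannot express.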
### Technical Details

- **Application of Deep Reinforcement Learning**: DRL is used to learn the update strategy, i.e., to decide at each time step whether to update the current scheduling decision of each layer.
- **Hierarchical Deep Reinforcement Learning Framework**: The overall learning task is decomposed into three independent sub-tasks, each handled by a DRL controller, which significantly reduces task complexity.
- **Safe Multi-Agent Deep Reinforcement Learning**: To address the non-stationary environment problem, a safe multi-agent DRL method is designed to ensure the safety of online decisions.

### Experimental Validation

- EdgeTimer was integrated into a real-world-level simulator and evaluated using workload traces from Alibaba's production cluster.
- Under 45 typical scheduling rules, the experimental results show that EdgeTimer improves the profit of service providers (up to 9.1x more profit than existing approaches, according to the abstract) without sacrificing the delay performance of task requests.

### Conclusion

EdgeTimer improves resource scheduling in mobile edge computing by introducing adaptive, asynchronous, and autonomous timescale update mechanisms. It brings direct profit improvements to service providers while remaining non-intrusive to existing scheduling methods.
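The summary does not say how the safe multi-agent DRL method enforces safety. One common pattern for the binary update decision described under Technical Details is an action filter that overrides the agent when skipping an update would break a constraint. The sketch below assumes that pattern; the function name, the delay estimate, and the delay bound are hypothetical and are not taken from the paper.

```python
def safe_action(proposed_update: bool, estimated_delay: float, delay_bound: float) -> bool:
    """Return the action to execute: True = update the decision, False = keep it.

    Hypothetical safety filter: if the agent proposes to keep the current
    decision but the estimated delay already exceeds the bound, force an update.
    """
    if not proposed_update and estimated_delay > delay_bound:
        return True
    return proposed_update


# The agent wants to skip, but the delay estimate violates the bound: overridden.
forced = safe_action(False, estimated_delay=120.0, delay_bound=100.0)
# The agent wants to skip and the delay is within the bound: the skip is kept.
kept = safe_action(False, estimated_delay=80.0, delay_bound=100.0)
```

A filter of this kind only ever adds updates, so it trades some operation cost for a guarantee on the constrained metric.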

EdgeTimer: Adaptive Multi-Timescale Scheduling in Mobile Edge Computing with Deep Reinforcement Learning