Abstract: To deliver high-performance and reliable services, data center networks (DCNs) are often over-provisioned for peak workloads and traffic bursts. In real-world data centers, however, traffic seldom reaches the network's peak capacity, which results in significant energy waste. Traditional energy conservation approaches either suffer from high computational complexity and low solution quality, or cannot adjust their strategies dynamically as data center network traffic changes. Deep reinforcement learning (DRL) provides an effective way to deal with these issues. However, most existing DRL-based schemes consider either a continuous action space or a discrete action space, but not both, which greatly restricts the optimality of their decisions. To solve these problems, this paper proposes a novel DRL-based DCN energy optimization framework, named SmartDCN. Specifically, SmartDCN consists of a traffic prediction module (TPM) and an energy optimization module (EOM). TPM combines JANET, an improved LSTM model, with an attention mechanism to achieve high prediction accuracy, while EOM integrates our newly proposed parameterized DRL algorithm, named PAS-DQN, which operates on a discrete-continuous hybrid action space. PAS-DQN implements a two-level control mechanism for the network, taking the future data center traffic predicted by TPM as input. It dynamically aggregates current traffic and trades off energy efficiency, performance, and robustness: it calculates the minimum required network subset and turns off the network devices outside that subset to save power. Experimental results show that SmartDCN significantly outperforms existing state-of-the-art schemes in terms of energy savings under various network conditions.
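The idea of a discrete-continuous hybrid action space can be illustrated with a minimal sketch in the style of parameterized Q-learning. This is not PAS-DQN itself, whose details the abstract does not give. Every name below (`select_hybrid_action`, `param_fns`, `q_fn`, `toy_q`) is hypothetical, as are the assumptions that the state is a predicted load expressed as a fraction of total capacity, that the discrete choice is how many of four switches stay powered on, and that the continuous parameter is a utilization cap.

```python
import random


def select_hybrid_action(state, param_fns, q_fn, epsilon=0.0, rng=random):
    """Return (k, x_k): a discrete choice k and its continuous parameter x_k."""
    # One continuous parameter per discrete action, computed from the state.
    params = [fn(state) for fn in param_fns]
    if rng.random() < epsilon:
        # Explore: pick a discrete action uniformly at random.
        k = rng.randrange(len(params))
    else:
        # Exploit: score each (k, x_k) pair and keep the best discrete action.
        q_values = [q_fn(state, k, x) for k, x in enumerate(params)]
        k = max(range(len(params)), key=q_values.__getitem__)
    return k, params[k]


# Toy stand-ins for the learned networks (assumptions, not trained models).
NUM_SWITCHES = 4


def toy_q(load, k, cap):
    """Hypothetical value: penalize active switches and unserved load."""
    active = k + 1                                # action k keeps k + 1 switches on
    capacity = active / NUM_SWITCHES * cap        # usable fraction of total capacity
    shortfall = max(0.0, load - capacity)
    return -0.1 * active - 10.0 * shortfall


# Each discrete action proposes the same fixed utilization cap of 0.8 here.
param_fns = [lambda load: 0.8 for _ in range(NUM_SWITCHES)]

predicted_load = 0.35                             # e.g. a traffic forecast
k, x = select_hybrid_action(predicted_load, param_fns, toy_q)
print(f"keep {k + 1} of {NUM_SWITCHES} switches on, utilization cap {x}")
```

In this toy setting the greedy choice keeps two switches on: one switch cannot carry the predicted load under the cap, and every additional switch only adds energy cost. In a learned system the parameter functions and the value function would be neural networks trained jointly.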

DRLCap: Runtime GPU Frequency Capping with Deep Reinforcement Learning

Multi-core Chip Dynamic Power Management Framework Based on Reinforcement Learning

An Efficient and Flexible Learning Framework for Dynamic Power and Thermal Co-Management

Online Power Management for Multi-Cores: A Reinforcement Learning Based Approach

GMI-DRL: Empowering Multi-GPU Deep Reinforcement Learning with GPU Spatial Multiplexing

Optimizing Data Centre Energy Efficiency via Event-Driven Deep Reinforcement Learning

Deep Reinforcement Learning-Based Power Management for Chiplet-Based Multicore Systems

A Lightweight DRDPG-Based RL DVFS for Video Rendering on CPU-GPU Integrated SoC

Parameterized deep reinforcement learning with hybrid action space for energy efficient data center networks

Real-Time Battery Thermal Management for Electric Vehicles Based on Deep Reinforcement Learning

Modular reinforcement learning for self-adaptive energy efficiency optimization in multicore system

A Framework for Mapping DRL Algorithms with Prioritized Replay Buffer onto Heterogeneous Platforms

A Reinforcement Learning Approach for Performance-aware Reduction in Power Consumption of Data Center Compute Nodes

Multi-Stream Scheduling of Inference Pipelines on Edge Devices - a DRL Approach

ReinFog: A DRL Empowered Framework for Resource Management in Edge and Cloud Computing Environments

Sustainable Supercomputing for AI: GPU Power Capping at HPC Scale

Battery Health-Aware and Deep Reinforcement Learning-Based Energy Management for Naturalistic Data-Driven Driving Scenarios

Deep Reinforcement Learning: Framework, Applications, and Embedded Implementations

Deep Reinforcement Learning for Energy-Efficient on the Heterogeneous Computing Architecture

Model-Free Real-Time Autonomous Energy Management for a Residential Multi-Carrier Energy System: A Deep Reinforcement Learning Approach

Deep Reinforcement Learning Based Energy Management of a Hybrid Electricity-Heat-Hydrogen Energy System with Demand Response