Abstract: Graph optimization problems (such as minimum vertex cover, maximum cut, and the travelling salesman problem) appear in many fields, including the social sciences, power systems, chemistry, and bioinformatics. Recently, deep reinforcement learning (DRL) has shown success in automatically learning good heuristics for solving graph optimization problems. However, existing RL systems either do not support graph RL environments or do not support multiple GPUs in a distributed setting. This lack of parallelization and scalability has limited the ability of reinforcement learning to solve large-scale graph optimization problems. To address these challenges, we develop RL4GO, a high-performance distributed-GPU DRL framework for solving graph optimization problems. RL4GO focuses on a class of computationally demanding RL problems in which both the RL environment and the policy model are highly computation-intensive. Traditional reinforcement learning systems often assume that either the RL environment has low time complexity or the policy model is small. In this work, we distribute large-scale graphs across distributed GPUs and use spatial parallelism and data parallelism to achieve scalable performance. We compare and analyze the performance of spatial parallelism and data parallelism and show how they differ. To support graph neural network (GNN) layers that take as input data samples partitioned across distributed GPUs, we design parallel mathematical kernels that operate on distributed 3D sparse and 3D dense tensors. To handle costly RL environments, we design a parallel graph environment that scales up all RL-environment-related operations. By combining the scalable GNN layers with the scalable RL environment, we develop high-performance parallel RL4GO training and inference algorithms.
Furthermore, we propose two optimization techniques, replay buffer on-the-fly graph generation and adaptive multiple-node selection, to minimize the spatial cost and accelerate reinforcement learning. This work also conducts in-depth analyses of parallel efficiency and memory cost and shows that the designed RL4GO algorithms are scalable on numerous distributed GPUs. Evaluations on large-scale graphs show that 1) RL4GO training and inference achieve good parallel efficiency on 192 GPUs; 2) its training can be 18 times faster than the state-of-the-art Gorila distributed RL framework [34]; and 3) its inference performance achieves a 26-fold improvement over Gorila.
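The contrast between the two parallelization schemes named in the abstract can be illustrated with a minimal sketch. It is not RL4GO code: plain Python lists stand in for GPU tensors, workers are simulated by slicing, and the function names (`aggregate`, `data_parallel`, `spatial_parallel`) are hypothetical. It assumes a simple sum-over-neighbors aggregation as a stand-in for a GNN layer, and shows only that splitting a batch across workers (data parallelism) and splitting one graph's nodes across workers (spatial parallelism) both reproduce the single-worker result.

```python
# Illustrative only: lists stand in for GPU tensors, and "workers" are
# simulated by slicing; no real inter-device communication happens.

def aggregate(adj_rows, features):
    """Sum neighbor features for each row of a (slice of an) adjacency matrix."""
    dim = len(features[0])
    return [
        [sum(a * features[j][d] for j, a in enumerate(row)) for d in range(dim)]
        for row in adj_rows
    ]

def data_parallel(batch, num_workers):
    """Each worker processes whole graphs: the batch is split across workers."""
    out = [None] * len(batch)
    for w in range(num_workers):
        for i in range(w, len(batch), num_workers):  # round-robin assignment
            adj, feats = batch[i]
            out[i] = aggregate(adj, feats)
    return out

def spatial_parallel(adj, feats, num_workers):
    """Each worker owns a contiguous block of rows (nodes) of ONE graph."""
    n = len(adj)
    block = (n + num_workers - 1) // num_workers  # ceiling division
    out = []
    for w in range(num_workers):
        rows = adj[w * block:(w + 1) * block]
        # Every worker reads the full feature matrix here; on real hardware
        # this is where remote node features would have to be communicated.
        out.extend(aggregate(rows, feats))
    return out

# A 4-node path graph 0-1-2-3 with 2-dimensional node features.
adj = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
feats = [[1, 0], [0, 1], [2, 2], [3, 1]]

reference = aggregate(adj, feats)  # single-worker result
assert spatial_parallel(adj, feats, num_workers=2) == reference
assert data_parallel([(adj, feats)] * 3, num_workers=2) == [reference] * 3
```

In this toy setting the spatial variant needs every worker to see all node features, which hints at why a graph too large for one device calls for dedicated distributed kernels, whereas the data-parallel variant requires each worker to hold complete graphs.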

MSRL: Distributed Reinforcement Learning with Dataflow Fragments

S2rl

SRL: Scaling Distributed Reinforcement Learning to Over Ten Thousand Cores

A Framework for Mapping DRL Algorithms with Prioritized Replay Buffer onto Heterogeneous Platforms

HybridFlow: A Flexible and Efficient RLHF Framework

Distributed Multi-Agent Reinforcement Learning Based on Graph-Induced Local Value Functions

Federated Ensemble Model-Based Reinforcement Learning in Edge Computing

FLIRRAS: Fast Learning With Integrated Reward and Reduced Action Space for Online Multitask Offloading

A Distributed-GPU Deep Reinforcement Learning System for Solving Large Graph Optimization Problems

DistRL: An Asynchronous Distributed Reinforcement Learning Framework for On-Device Control Agents

Reinforcement and transfer learning for distributed analytics in fragmented software defined coalitions

Reinforcement Learning for Load-balanced Parallel Particle Tracing

An Adaptive Placement and Parallelism Framework for Accelerating RLHF Training

Feudal Graph Reinforcement Learning

ReinFog: A DRL Empowered Framework for Resource Management in Edge and Cloud Computing Environments

Momentum for the Win: Collaborative Federated Reinforcement Learning across Heterogeneous Environments

Solving non-permutation flow-shop scheduling problem via a novel deep reinforcement learning approach

Federated Multiagent Deep Reinforcement Learning Approach via Physics-Informed Reward for Multimicrogrid Energy Management

DistFlow Safe Reinforcement Learning Algorithm for Voltage Magnitude Regulation in Distribution Networks