Abstract: Applying reinforcement learning (RL) at large scale requires efficient training computation. Existing parallel RL frameworks cover a wide range of RL algorithms and parallelization techniques, but their heavyweight communication layers keep them from reaching the hardware's throughput limit, and the training performance that comes with it, on a single desktop. In this paper, we propose Spreeze, a lightweight parallel framework for RL that uses the hardware resources of a single desktop efficiently enough to approach that throughput limit. We asynchronously parallelize experience sampling, network update, performance evaluation, and visualization, and employ multiple efficient data transmission techniques to transfer the different types of data between processes. The framework automatically adjusts the parallelization hyperparameters to the computing ability of the hardware device so that it can perform efficient large-batch updates. Exploiting the structure of the "Actor-Critic" RL algorithm, our framework uses dual GPUs to update the actor and critic networks independently, which further improves throughput. Simulation results show that our framework achieves up to 15,000 Hz experience sampling and a 370,000 Hz network update frame rate using only a personal desktop computer, an order of magnitude higher than other mainstream parallel RL frameworks, resulting in a 73% reduction in training time. Our work on fully utilizing the hardware resources of a single desktop computer is fundamental to enabling efficient large-scale distributed RL training.
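The decoupling of experience sampling from network update described in the abstract can be illustrated with a minimal sketch. This is not Spreeze's implementation: it uses Python threads and a bounded `queue.Queue` as stand-ins for the framework's processes and data transmission techniques, and the names `sampler`, `learner`, and `run`, along with the batch and buffer sizes, are hypothetical choices made for illustration only.

```python
import queue
import random
import threading


def sampler(buffer, stop, worker_id):
    """Hypothetical sampling worker: pushes fake transitions until told to stop."""
    rng = random.Random(worker_id)
    while not stop.is_set():
        transition = (worker_id, rng.random())  # stand-in for (s, a, r, s')
        try:
            buffer.put(transition, timeout=0.01)
        except queue.Full:
            continue  # learner is behind; re-check the stop flag and retry


def learner(buffer, batch_size, num_updates):
    """Hypothetical learner: draws fixed-size batches and counts consumed samples."""
    consumed = 0
    for _ in range(num_updates):
        batch = [buffer.get() for _ in range(batch_size)]
        consumed += len(batch)  # a real learner would run a gradient step here
    return consumed


def run(num_samplers=2, batch_size=32, num_updates=5):
    """Run samplers and the learner concurrently; return samples consumed."""
    buffer = queue.Queue(maxsize=256)  # bounded so samplers cannot run away
    stop = threading.Event()
    workers = [
        threading.Thread(target=sampler, args=(buffer, stop, i), daemon=True)
        for i in range(num_samplers)
    ]
    for w in workers:
        w.start()
    consumed = learner(buffer, batch_size, num_updates)
    stop.set()  # signal samplers to exit once learning is done
    for w in workers:
        w.join()
    return consumed
```

Neither side waits for the other to finish a full cycle: samplers keep filling the buffer while the learner drains it in batches, which is the property that lets sampling rate and update rate be tuned separately.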

SpeedyZero: Mastering Atari with Limited Data and Time

Mastering Atari Games with Limited Data

EfficientZero V2: Mastering Discrete and Continuous Control with Limited Data

Learning with Training Wheels: Speeding up Training with a Simple Controller for Deep Reinforcement Learning

Become a Proficient Player with Limited Data through Watching Pure Videos

Reverse Forward Curriculum Learning for Extreme Sample and Demonstration Efficiency in Reinforcement Learning

Mastering Atari with Discrete World Models

FastRLAP: A System for Learning High-Speed Driving via Deep RL and Autonomous Practicing

Beyond The Rainbow: High Performance Deep Reinforcement Learning On A Desktop PC

SEED RL: Scalable and Efficient Deep-RL with Accelerated Central Inference

Read and Reap the Rewards: Learning to Play Atari with the Help of Instruction Manuals

Spreeze: High-Throughput Parallel Reinforcement Learning Framework

Model-Based Reinforcement Learning for Atari

Sample Efficient Reinforcement Learning Using Graph-Based Memory Reconstruction

ReZero: Boosting MCTS-based Algorithms by Backward-view and Entire-buffer Reanalyze

Pre-training with Non-expert Human Demonstration for Deep Reinforcement Learning

State of the Art Control of Atari Games Using Shallow Reinforcement Learning

SRL: Scaling Distributed Reinforcement Learning to Over Ten Thousand Cores

Accelerating Robot Reinforcement Learning with Samples of Different Simulation Precision