Matthew W. Hoffman, Bobak Shahriari, John Aslanides, Gabriel Barth-Maron, Nikola Momchev, Danila Sinopalnikov, Piotr Stańczyk, Sabela Ramos, Anton Raichuk, Damien Vincent, Léonard Hussenot, Robert Dadashi, Gabriel Dulac-Arnold, Manu Orsini, Alexis Jacq, Johan Ferret, Nino Vieillard, Seyed Kamyar Seyed Ghasemipour, Sertan Girgin, Olivier Pietquin, Feryal Behbahani, Tamara Norman, Abbas Abdolmaleki, Albin Cassirer, Fan Yang, Kate Baumli, Sarah Henderson, Abe Friesen, Ruba Haroun, Alex Novikov, Sergio Gómez Colmenarejo, Serkan Cabi, Caglar Gulcehre, Tom Le Paine, Srivatsan Srinivasan, Andrew Cowie, Ziyu Wang, Bilal Piot, Nando de Freitas

Abstract: Deep reinforcement learning (RL) has led to many recent and groundbreaking advances. However, these advances have often come at the cost of both increased scale in the underlying architectures being trained and increased complexity of the RL algorithms used to train them. These increases have in turn made it more difficult for researchers to rapidly prototype new ideas or reproduce published RL algorithms. To address these concerns, this work describes Acme, a framework for constructing novel RL algorithms that is specifically designed to enable agents built from simple, modular components that can be used at various scales of execution. While the primary goal of Acme is to provide a framework for algorithm development, a secondary goal is to provide simple reference implementations of important or state-of-the-art algorithms. These implementations serve both as a validation of our design decisions and as an important contribution to reproducibility in RL research. In this work, we describe the major design decisions made within Acme and give further details on how its components can be used to implement various algorithms. Our experiments provide baselines for a number of common and state-of-the-art algorithms and show how these algorithms can be scaled up for much larger and more complex environments. This highlights one of the primary advantages of Acme, namely that it can be used to implement large, distributed RL algorithms that can run at massive scales while still maintaining the inherent readability of that implementation. This work presents a second version of the paper, which coincides with an increase in modularity, additional emphasis on offline, imitation, and learning-from-demonstrations algorithms, and various new agents implemented as part of Acme.

Resilient Mechanism Against Byzantine Failure for Distributed Deep Reinforcement Learning

Robot Simulation and Reinforcement Learning Training Platform Based on Distributed Architecture

Exploring the Vulnerability of Deep Reinforcement Learning-based Emergency Control for Low Carbon Power Systems

BR-DeFedRL: Byzantine-Robust Decentralized Federated Reinforcement Learning with Fast Convergence and Communication Efficiency

Data-Based Collaborative Learning for Multiagent Systems under Distributed Denial-of-Service Attacks

On the Foundation of Distributionally Robust Reinforcement Learning

Secure Deep Reinforcement Learning for Dynamic Resource Allocation in Wireless MEC Networks

Byzantine-Resilient Decentralized Collaborative Learning

Justinian's GAAvernor: Robust Distributed Learning with Gradient Aggregation Agent

Hierarchical Reinforcement Learning with Opponent Modeling for Distributed Multi-agent Cooperation

Dynamic Byzantine-Robust Learning: Adapting to Switching Byzantine Workers

Efficient Byzantine-Resilient Stochastic Gradient Descent

Resilience-based post disaster recovery optimization for infrastructure system via Deep Reinforcement Learning

Acme: A Research Framework for Distributed Reinforcement Learning

Cleanba: A Reproducible and Efficient Distributed Reinforcement Learning Platform

RL-Based Method for Benchmarking the Adversarial Resilience and Robustness of Deep Reinforcement Learning Policies

Efficient Diversity-based Experience Replay for Deep Reinforcement Learning

Enabling Robust DRL-Driven Networking Systems Via Teacher-Student Learning

Byzantine-resilient Decentralized Stochastic Gradient Descent

On the Optimal Batch Size for Byzantine-Robust Distributed Learning

Towards Resilience for Multi-Agent $QD$-Learning