XLand-MiniGrid: Scalable Meta-Reinforcement Learning Environments in JAX

Alexander Nikulin, Vladislav Kurenkov, Ilya Zisman, Artem Agarkov, Viacheslav Sinii, Sergey Kolesnikov

2024-06-10

Abstract: Inspired by the diversity and depth of XLand and the simplicity and minimalism of MiniGrid, we present XLand-MiniGrid, a suite of tools and grid-world environments for meta-reinforcement learning research. Written in JAX, XLand-MiniGrid is designed to be highly scalable and can potentially run on GPU or TPU accelerators, democratizing large-scale experimentation with limited resources. Along with the environments, XLand-MiniGrid provides pre-sampled benchmarks with millions of unique tasks of varying difficulty and easy-to-use baselines that allow users to quickly start training adaptive agents. In addition, we have conducted a preliminary analysis of scaling and generalization, showing that our baselines are capable of reaching millions of steps per second during training and validating that the proposed benchmarks are challenging.

Machine Learning

What problem does this paper attempt to address?

This paper introduces XLand-MiniGrid, a JAX-based suite of tools and grid-world environments for meta-reinforcement learning (meta-RL). Meta-RL targets two familiar weaknesses of RL, low sample efficiency and overfitting, by pre-training agents on a distribution of tasks so that they adapt to new problems more quickly. The obstacle is that current meta-RL methods need a large number of distinct tasks for pre-training, which can be infeasible for research labs and practitioners with limited resources.

XLand-MiniGrid combines the diversity and depth of XLand with the simplicity and minimalism of MiniGrid, using a scalable system of rules and goals to generate diverse task distributions. Because it is written in JAX, it is designed to scale and can run on GPU or TPU accelerators, which makes large-scale experiments easier to carry out. The paper also provides pre-sampled benchmarks containing millions of unique tasks of varying difficulty, along with easy-to-use baselines for quickly training adaptive agents.

A preliminary analysis of scaling and generalization shows that the baselines can reach millions of steps per second during training and that the proposed benchmarks are challenging. The existing baselines still leave room for improvement, particularly in generalization to new tasks. In summary, XLand-MiniGrid is an open-source library for meta-RL research that aims to make large-scale experiments feasible under limited resources and to support research into the limits and scalability of reinforcement learning algorithms.
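To illustrate why writing an environment in JAX helps it scale on accelerators, here is a minimal sketch of a toy grid-world stepped in parallel across many copies. It is not the XLand-MiniGrid API: the names `reset`, `step`, `GRID_SIZE`, and `MOVES`, the 9x9 grid, and the reach-the-goal reward are all assumptions made for illustration. It only shows the general pattern of writing a pure single-environment function and batching it with `jax.vmap` and `jax.jit`.

```python
import jax
import jax.numpy as jnp

GRID_SIZE = 9  # hypothetical grid side length, chosen for illustration

# Action index -> (row, col) displacement: up, right, down, left.
MOVES = jnp.array([[-1, 0], [0, 1], [1, 0], [0, -1]])


def reset(key):
    """Sample a random agent position and a random goal position."""
    agent_key, goal_key = jax.random.split(key)
    agent = jax.random.randint(agent_key, (2,), 0, GRID_SIZE)
    goal = jax.random.randint(goal_key, (2,), 0, GRID_SIZE)
    return agent, goal


def step(agent, goal, action):
    """Move the agent one cell, clipping at the walls; reward 1.0 on the goal."""
    new_agent = jnp.clip(agent + MOVES[action], 0, GRID_SIZE - 1)
    reward = jnp.all(new_agent == goal).astype(jnp.float32)
    return new_agent, reward


# vmap turns the single-environment functions into batched ones;
# jit compiles them for whichever backend (CPU, GPU, TPU) is available.
batched_reset = jax.jit(jax.vmap(reset))
batched_step = jax.jit(jax.vmap(step))

num_envs = 1024
keys = jax.random.split(jax.random.PRNGKey(0), num_envs)
agents, goals = batched_reset(keys)
actions = jnp.zeros(num_envs, dtype=jnp.int32)  # every environment moves "up"
next_agents, rewards = batched_step(agents, goals, actions)
```

Because `reset` and `step` are pure functions of their inputs, the same code handles one environment or thousands without change; only the leading batch dimension of the inputs differs.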

NAVIX: Scaling MiniGrid Environments with JAX

XLand-100B: A Large-Scale Multi-Task Dataset for In-Context Reinforcement Learning

minimax: Efficient Baselines for Autocurricula in JAX

Jumanji: a Diverse Suite of Scalable Reinforcement Learning Environments in JAX

Minigrid & Miniworld: Modular & Customizable Reinforcement Learning Environments for Goal-Oriented Tasks

JaxMARL: Multi-Agent RL Environments in JAX

NovGrid: A Flexible Grid World for Evaluating Agent Response to Novelty

Mini Honor of Kings: A Lightweight Environment for Multi-Agent Reinforcement Learning

GridToPix: Training Embodied Agents with Minimal Supervision

PCGRL+: Scaling, Control and Generalization in Reinforcement Learning Level Generators

Avalon: A Benchmark for RL Generalization Using Procedurally Generated Worlds

Accelerating Goal-Conditioned RL Algorithms and Research

Massively Multiagent Minigames for Training Generalist Agents

IGLU Gridworld: Simple and Fast Environment for Embodied Dialog Agents

Discovering Minimal Reinforcement Learning Environments

Mini-BEHAVIOR: A Procedurally Generated Benchmark for Long-horizon Decision-Making in Embodied AI

Pgx: Hardware-Accelerated Parallel Game Simulators for Reinforcement Learning

High Performance Simulation for Scalable Multi-Agent Reinforcement Learning

marl-jax: Multi-Agent Reinforcement Leaning Framework

Off-the-Grid MARL: Datasets with Baselines for Offline Multi-Agent Reinforcement Learning