XLand-100B: A Large-Scale Multi-Task Dataset for In-Context Reinforcement Learning

Alexander Nikulin, Ilya Zisman, Alexey Zemtsov, Viacheslav Sinii, Vladislav Kurenkov, Sergey Kolesnikov

2024-06-13

Abstract: Following the success of the in-context learning paradigm in large-scale language and computer vision models, the recently emerging field of in-context reinforcement learning is experiencing rapid growth. However, its development has been held back by the lack of challenging benchmarks, as all the experiments have been carried out in simple environments and on small-scale datasets. We present XLand-100B, a large-scale dataset for in-context reinforcement learning based on the XLand-MiniGrid environment, as a first step to alleviate this problem. It contains complete learning histories for nearly 30,000 different tasks, covering 100B transitions and 2.5B episodes. It took 50,000 GPU hours to collect the dataset, which is beyond the reach of most academic labs. Along with the dataset, we provide the utilities to reproduce or expand it even further. With this substantial effort, we aim to democratize research in the rapidly growing field of in-context reinforcement learning and provide a solid foundation for further scaling. The code is open-source and available under the Apache 2.0 licence at https://github.com/dunno-lab/xland-minigrid-datasets.

Machine Learning, Artificial Intelligence

What problem does this paper attempt to address?

This paper presents XLand-100B, a large-scale multi-task dataset for in-context reinforcement learning. While in-context learning has advanced quickly in language and computer vision, in-context reinforcement learning has been held back by the lack of challenging benchmarks: existing reinforcement learning datasets cover too few tasks to train models that learn in context. XLand-100B contains complete learning histories for nearly 30,000 different tasks, covering 100 billion transitions and 2.5 billion episodes. Collecting it required 50,000 GPU hours, which is beyond the capacity of most academic labs. The dataset is built on the XLand-MiniGrid environment, is compatible with various in-context reinforcement learning methods, and ships with tools for reproducing or extending it. The paper also describes the data collection process, data format, data quality, and applicability, and evaluates two existing in-context reinforcement learning methods on the dataset: Algorithm Distillation (AD) and Decision Pretraining Transformer (DPT). The experiments show that these methods still leave substantial room for improvement on complex tasks, which indicates that considerable research remains to be done. Overall, the paper addresses the lack of large-scale, diverse datasets for in-context reinforcement learning, with the aim of supporting further research and better generalization in this field.
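To give a feel for the scale quoted above, the headline figures can be turned into rough per-task and per-episode averages. The sketch below is only back-of-the-envelope arithmetic on the rounded numbers from the abstract: it treats "nearly 30,000" tasks as exactly 30,000, and the names `NUM_TASKS`, `scale_summary`, and so on are illustrative, not part of the dataset's tooling. The per-GPU-hour figure assumes the 50,000 GPU hours cover the whole collection process, so it is at best an order-of-magnitude estimate.

```python
# Rounded headline figures from the abstract (assumption: "nearly 30,000"
# tasks is treated as exactly 30,000 for this rough estimate).
NUM_TASKS = 30_000
NUM_TRANSITIONS = 100 * 10**9   # 100B transitions
NUM_EPISODES = 25 * 10**8       # 2.5B episodes
GPU_HOURS = 50_000


def scale_summary():
    """Return rough averages implied by the headline numbers."""
    return {
        "transitions_per_episode": NUM_TRANSITIONS / NUM_EPISODES,
        "episodes_per_task": NUM_EPISODES / NUM_TASKS,
        "transitions_per_task": NUM_TRANSITIONS / NUM_TASKS,
        "transitions_per_gpu_hour": NUM_TRANSITIONS / GPU_HOURS,
    }


if __name__ == "__main__":
    for name, value in scale_summary().items():
        print(f"{name}: {value:,.0f}")
```

On these rounded figures, an average episode is about 40 transitions long and each task contributes on the order of 80,000 episodes, or roughly 3.3 million transitions of learning history.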

XLand-MiniGrid: Scalable Meta-Reinforcement Learning Environments in JAX

In-Context Reinforcement Learning for Variable Action Spaces

VL-ICL Bench: The Devil in the Details of Multimodal In-Context Learning

LLMs Are In-Context Reinforcement Learners

LMAct: A Benchmark for In-Context Imitation Learning with Long Multimodal Demonstrations

Jumanji: a Diverse Suite of Scalable Reinforcement Learning Environments in JAX

Few-shot In-Context Preference Learning Using Large Language Models

In-Context Learning with Long-Context Models: An In-Depth Exploration

XL$^2$Bench: A Benchmark for Extremely Long Context Understanding with Long-range Dependencies

LongBench: A Bilingual, Multitask Benchmark for Long Context Understanding

MC-Bench: A Benchmark for Multi-Context Visual Grounding in the Era of MLLMs

NAVIX: Scaling MiniGrid Environments with JAX

$\infty$Bench: Extending Long Context Evaluation Beyond 100K Tokens

NarrativeXL: A Large-scale Dataset For Long-Term Memory Models

In-Context Learning for Text Classification with Many Labels

Benchmarking Deep Reinforcement Learning for Continuous Control

LAION-5B: An open large-scale dataset for training next generation image-text models

Off-the-Grid MARL: Datasets with Baselines for Offline Multi-Agent Reinforcement Learning

Large Language Models Know What Makes Exemplary Contexts

Animal Kingdom: A Large and Diverse Dataset for Animal Behavior Understanding