Abstract: Conventional reinforcement learning (RL) algorithms exhibit broad generality in their theoretical formulation and high performance on several challenging domains when combined with powerful function approximation. However, developing RL algorithms that perform well across problems with unstructured observations at scale remains challenging because most function approximation methods rely on externally provisioned knowledge about the structure of the input for good performance (e.g., convolutional networks, graph neural networks, tile-coding). A common practice in RL is to evaluate algorithms on a single problem, or on problems with limited variation in the observation scale. RL practitioners lack a systematic way to study how well a single RL algorithm performs when instantiated across a range of problem scales, and they lack function approximation techniques that scale well with unstructured observations. We address these limitations by providing environments and algorithms to study scaling for unstructured observation vectors and flat action spaces. We introduce a family of combinatorial RL problems with an exponentially large state space and high-dimensional dynamics but where linear computation is sufficient to learn a (nonlinear) value function estimate for performant control. We provide an algorithm that constructs reward-relevant general value function (GVF) questions to find and exploit predictive structure directly from the experience stream. In an empirical evaluation of the approach on synthetic problems, we observe a sample complexity that scales linearly with the observation size. The proposed algorithm reliably outperforms a conventional deep RL algorithm on these scaling problems, and it exhibits several desirable auxiliary properties. These results suggest new algorithmic mechanisms by which algorithms can learn at scale from unstructured data.

Towards model-free RL algorithms that scale well with unstructured data
