Abstract: With the development of deep learning technology, deep reinforcement learning (DRL) has successfully built intelligent agents for sequential decision-making problems through interaction with image-based environments. However, learning from unlimited interaction is impractical and sample inefficient, because training an agent requires extensive trial and error and a large number of samples. One response to this problem is sample-efficient DRL, a research area that encourages learning effective state representations from limited interactions with image-based environments. Previous methods could surpass human performance by training an RL agent with self-supervised learning and data augmentation to learn good state representations from the given interactions. However, most existing methods consider only the similarity of image observations, which makes it hard for them to capture semantic representations. To address this challenge, we propose spatio-temporal and action-based contrastive representation (STACoRe) learning for sample-efficient DRL. STACoRe performs two contrastive learning tasks to learn proper state representations: one uses the agent's actions as pseudo labels, and the other uses spatio-temporal information. In particular, for the action-based contrastive learning, we propose a method that automatically selects data augmentation techniques suitable for each environment, which stabilizes model training. We train the model end to end by simultaneously optimizing an action-based contrastive loss function and spatio-temporal contrastive loss functions, which improves sample efficiency for DRL. We evaluate on 26 benchmark games in Atari 2600, with environment interaction limited to only 100k steps. The experimental results confirm that our method is more sample efficient than existing methods. The code is available at https://github.com/dudwojae/STACoRe.
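To make the idea of using actions as pseudo labels concrete, the following is a minimal NumPy sketch of a supervised-contrastive-style loss in which two observations count as a positive pair when the agent took the same action. It is an illustration under stated assumptions, not the STACoRe implementation: the function name `action_contrastive_loss`, the `temperature` value, and the choice of a SupCon-style formulation are all hypothetical, and the abstract does not specify the exact loss. The sketch also assumes a batch of at least two non-zero embeddings.

```python
import numpy as np


def action_contrastive_loss(embeddings, actions, temperature=0.1):
    """SupCon-style loss (assumed form): samples sharing an action are positives."""
    z = np.asarray(embeddings, dtype=np.float64)
    a = np.asarray(actions)
    n = z.shape[0]

    # L2-normalize so that dot products are cosine similarities.
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    logits = z @ z.T / temperature

    # Exclude each sample's similarity with itself from the softmax.
    self_mask = np.eye(n, dtype=bool)
    logits = np.where(self_mask, -np.inf, logits)

    # Numerically stable row-wise log-softmax.
    row_max = logits.max(axis=1, keepdims=True)
    shifted = logits - row_max
    log_prob = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

    # Positives: other samples in the batch with the same action label.
    pos_mask = (a[:, None] == a[None, :]) & ~self_mask
    pos_count = pos_mask.sum(axis=1)
    valid = pos_count > 0
    if not valid.any():
        return 0.0  # no anchor has a positive in this batch

    # Average log-probability over positives, then over valid anchors.
    sum_pos = np.where(pos_mask, log_prob, 0.0).sum(axis=1)
    return float(-(sum_pos[valid] / pos_count[valid]).mean())


# Toy batch: two tight clusters of embeddings.
z = np.array([[1.0, 0.0], [1.0, 0.01], [0.0, 1.0], [0.01, 1.0]])
aligned = action_contrastive_loss(z, [0, 0, 1, 1])     # actions match clusters
misaligned = action_contrastive_loss(z, [0, 1, 0, 1])  # actions cut across clusters
```

In this toy batch the loss is small when embeddings of same-action observations are already close together and large when they are not, which is the pressure that would pull same-action states together during training. In a full agent, a term like this would be summed with the spatio-temporal contrastive losses and optimized jointly with the encoder.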

Unsupervised State Representation Learning in Atari

Unsupervised Representation Learning in Partially Observable Atari Games

State Representation Learning Using an Unbalanced Atlas

State Representation Learning for Effective Deep Reinforcement Learning

Unsupervised Control Through Non-Parametric Discriminative Rewards

State of the Art Control of Atari Games Using Shallow Reinforcement Learning

Learning to Play Atari in a World of Tokens

Reinforcement Learning with Unsupervised Auxiliary Tasks

State Representation Learning with Adjacent State Consistency Loss for Deep Reinforcement Learning

STACoRe: Spatio-temporal and Action-Based Contrastive Representations for Reinforcement Learning in Atari

Learning Invariant Representations for Reinforcement Learning without Reconstruction

Light-weight probing of unsupervised representations for Reinforcement Learning

Model-Based Reinforcement Learning for Atari

Learning Actionable Representations with Goal-Conditioned Policies

Maximum Manifold Capacity Representations in State Representation Learning

Using deep reinforcement learning to reveal how the brain encodes abstract state-space representations in high-dimensional environments

State Representations as Incentives for Reinforcement Learning Agents: A Sim2Real Analysis on Robotic Grasping

Unsupervised Representation Learning in Deep Reinforcement Learning: A Review

Self-supervised Visual Reinforcement Learning with Object-centric Representations

Bootstrapped Representations in Reinforcement Learning