Abstract: Deep reinforcement learning (RL) algorithms suffer severe performance degradation when interaction data is scarce, which limits their real-world application. Recently, visual representation learning has been shown to be effective and promising for boosting sample efficiency in RL. These methods usually rely on contrastive learning and data augmentation to train a transition model, which differs from how the model is used in RL: performing value-based planning. Accordingly, the representations learned by these visual methods may be good for recognition but not optimal for estimating state value and solving the decision problem. To address this issue, we propose a novel method, called value-consistent representation learning (VCR), to learn representations that are directly related to decision-making. More specifically, VCR trains a model to predict the future state (also referred to as the "imagined state") based on the current one and a sequence of actions. Instead of aligning this imagined state with a real state returned by the environment, VCR applies a Q-value head to both states and obtains two distributions of action values. A distance between the two distributions is then computed and minimized, forcing the imagined state to produce an action-value prediction similar to that of the real state. We develop two implementations of this idea, for discrete and continuous action spaces respectively. We conduct experiments on the Atari 100k and DeepMind Control Suite benchmarks to validate their effectiveness in improving sample efficiency. The results show that our methods achieve new state-of-the-art performance for search-free RL algorithms.
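
The value-consistency idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the linear encoder, transition model and Q head, the dimensions, the softmax over Q-values, and the use of KL divergence as the distance are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for this illustration.
OBS_DIM, LATENT_DIM, NUM_ACTIONS = 8, 4, 3

# Hypothetical linear stand-ins for the encoder, transition model and Q head.
W_enc = rng.normal(size=(OBS_DIM, LATENT_DIM)) * 0.1
W_trans = rng.normal(size=(LATENT_DIM + NUM_ACTIONS, LATENT_DIM)) * 0.1
W_q = rng.normal(size=(LATENT_DIM, NUM_ACTIONS)) * 0.1


def encode(obs):
    """Map an observation to a latent state."""
    return np.tanh(obs @ W_enc)


def imagine(z, actions):
    """Roll a latent state forward through a sequence of discrete actions."""
    for a in actions:
        one_hot = np.eye(NUM_ACTIONS)[a]
        z = np.tanh(np.concatenate([z, one_hot]) @ W_trans)
    return z


def q_head(z):
    """Predict one Q-value per action from a latent state."""
    return z @ W_q


def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()


def value_consistency_loss(obs_t, actions, obs_future):
    """KL divergence between action-value distributions of the real and imagined states.

    The distance (KL over softmax-normalized Q-values) is an assumption of this sketch.
    """
    z_imagined = imagine(encode(obs_t), actions)  # predicted future latent state
    z_real = encode(obs_future)                   # latent state of the observed future
    p_real = softmax(q_head(z_real))              # would be treated as a fixed target in training
    p_imag = softmax(q_head(z_imagined))
    return float(np.sum(p_real * (np.log(p_real) - np.log(p_imag))))


obs_t = rng.normal(size=OBS_DIM)
obs_future = rng.normal(size=OBS_DIM)
loss = value_consistency_loss(obs_t, [0, 2], obs_future)
```

In a real training loop this loss would be minimized by gradient descent with respect to the encoder and transition parameters; the sketch only computes its value. Note that the latent states themselves are never compared directly, only the action-value predictions derived from them.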

State Representation Learning for Effective Deep Reinforcement Learning.

State Representation Learning with Adjacent State Consistency Loss for Deep Reinforcement Learning.

S2RL: Do We Really Need to Perceive All States in Deep Multi-Agent Reinforcement Learning?

Value-Consistent Representation Learning for Data-Efficient Reinforcement Learning

A State Representation Dueling Network for Deep Reinforcement Learning

Adaptive Visual Servo Regulation Control for Camera-in-Hand Configuration with a Fixed Camera Extension

Enhancing Visual Reinforcement Learning with State-Action Representation

State of the Art Control of Atari Games Using Shallow Reinforcement Learning

Episodic Reinforcement Learning with Expanded State-reward Space

Learning Controllable Elements Oriented Representations for Reinforcement Learning

Neural Episodic Control with State Abstraction

Bootstrapped Representations in Reinforcement Learning

Unsupervised State Representation Learning in Atari

LLM-Empowered State Representation for Reinforcement Learning

Improving Deep Reinforcement Learning with Mirror Loss

For SALE: State-Action Representation Learning for Deep Reinforcement Learning

Unsupervised Representation Learning in Partially Observable Atari Games

Unified State Representation Learning under Data Augmentation

EXPLORATORY MOTIVATION AND ANIMAL HANDLING: THE EFFECT ON RUNWAY PERFORMANCE OF START-BOX EXPOSURE TIME.

Learning State Representations via Retracing in Reinforcement Learning