Abstract: In offline reinforcement learning (RL), an RL agent learns to solve a task using only a fixed dataset of previously collected data. While offline RL has been successful in learning real-world robot control policies, it typically requires large amounts of expert-quality data to learn effective policies that generalize to out-of-distribution states. Unfortunately, such data is often difficult and expensive to acquire in real-world tasks. Several recent works have leveraged data augmentation (DA) to inexpensively generate additional data, but most DA works apply augmentations in a random fashion and ultimately produce highly suboptimal augmented experience. In this work, we propose Guided Data Augmentation (GuDA), a human-guided DA framework that generates expert-quality augmented data. The key insight behind GuDA is that while it may be difficult to demonstrate the sequence of actions required to produce expert data, a user can often easily characterize when an augmented trajectory segment represents progress toward task completion. Thus, a user can restrict the space of possible augmentations to automatically reject suboptimal augmented data. To extract a policy from GuDA, we use off-the-shelf offline reinforcement learning and behavior cloning algorithms. We evaluate GuDA on a physical robot soccer task as well as simulated D4RL navigation tasks, a simulated autonomous driving task, and a simulated soccer task. Empirically, GuDA enables learning given a small initial dataset of potentially suboptimal experience and outperforms a random DA strategy as well as a model-based DA strategy.
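
The idea of rejecting augmentations that do not represent progress can be illustrated with a minimal sketch. This is not the authors' implementation: the 2D navigation setting, the rotation augmentation, the helper names (`makes_progress`, `rotate`, `guided_augment`), and the "ends closer to the goal" rule are all assumptions chosen for illustration. The sketch only transforms states; a real pipeline would also need to transform actions and rewards consistently.

```python
import math
import random


def makes_progress(segment, goal):
    """Hypothetical user rule: the segment ends closer to the goal than it starts."""
    return math.dist(segment[-1], goal) < math.dist(segment[0], goal)


def rotate(segment, angle):
    """Rotate a 2D trajectory segment about its first point (assumed augmentation)."""
    x0, y0 = segment[0]
    c, s = math.cos(angle), math.sin(angle)
    return [
        (x0 + c * (x - x0) - s * (y - y0), y0 + s * (x - x0) + c * (y - y0))
        for x, y in segment
    ]


def guided_augment(segment, goal, n_samples, rng):
    """Sample random rotations and keep only those the progress rule accepts."""
    kept = []
    for _ in range(n_samples):
        candidate = rotate(segment, rng.uniform(-math.pi, math.pi))
        if makes_progress(candidate, goal):
            kept.append(candidate)
    return kept


segment = [(0.0, 0.0), (1.0, 0.0)]   # toy segment of 2D positions
goal = (5.0, 0.0)                     # toy goal location
kept = guided_augment(segment, goal, n_samples=200, rng=random.Random(0))
```

A purely random strategy would keep all 200 candidates, including those that move away from the goal; the guided variant keeps only the subset that the rule accepts.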

ACAMDA: Improving Data Efficiency in Reinforcement Learning Through Guided Counterfactual Data Augmentation

Sample-Efficient Reinforcement Learning via Counterfactual-Based Data Augmentation

Counterfactual Adversarial Learning with Representation Interpolation

Understanding when Dynamics-Invariant Data Augmentations Benefit Model-Free Reinforcement Learning Updates

AD-AUG: Adversarial Data Augmentation for Counterfactual Recommendation

Causal Action Influence Aware Counterfactual Data Augmentation

Learning Better with Less: Effective Augmentation for Sample-Efficient Visual Reinforcement Learning

Data Augmentation for Continual RL via Adversarial Gradient Episodic Memory

Revisiting Data Augmentation in Deep Reinforcement Learning

Guided Data Augmentation for Offline Reinforcement Learning and Imitation Learning

A Comprehensive Survey of Data Augmentation in Visual Reinforcement Learning

Implicit Counterfactual Data Augmentation for Robust Learning

Towards Controlled Data Augmentations for Active Learning

Active Learning with Controllable Augmentation Induced Acquisition

AdaAugment: A Tuning-Free and Adaptive Approach to Enhance Data Augmentation

A Guide for Practical Use of ADMG Causal Data Augmentation

Automatic Data Augmentation for Generalization in Deep Reinforcement Learning

Implicit Counterfactual Data Augmentation for Deep Neural Networks

Towards More Sample Efficiency in Reinforcement Learning with Data Augmentation

Don’t Touch What Matters: Task-Aware Lipschitz Data Augmentation for Visual Reinforcement Learning