Revisiting Data Augmentation in Deep Reinforcement Learning

Jianshu Hu, Yunpeng Jiang, Paul Weng

2024-02-19

Abstract: Various data augmentation techniques have been recently proposed in image-based deep reinforcement learning (DRL). Although they empirically demonstrate the effectiveness of data augmentation for improving sample efficiency or generalization, which technique should be preferred is not always clear. To tackle this question, we analyze existing methods to better understand them and to uncover how they are connected. Notably, by expressing the variance of the Q-targets and that of the empirical actor/critic losses of these methods, we can analyze the effects of their different components and compare them. We furthermore formulate an explanation about how these methods may be affected by choosing different data augmentation transformations in calculating the target Q-values. This analysis suggests recommendations on how to exploit data augmentation in a more principled way. In addition, we include a regularization term called tangent prop, previously proposed in computer vision, but whose adaptation to DRL is novel to the best of our knowledge. We evaluate our proposition and validate our analysis in several domains. Compared to different relevant baselines, we demonstrate that it achieves state-of-the-art performance in most environments and shows higher sample efficiency and better generalization ability in some complex environments.

Machine Learning, Computer Vision and Pattern Recognition, Artificial Intelligence

What problem does this paper attempt to address?

This paper addresses how to use data augmentation effectively to improve sample efficiency and generalization in image-based deep reinforcement learning (DRL). Existing data augmentation methods have been shown empirically to improve sample efficiency or generalization, but it is not always clear which technique should be preferred. The authors therefore analyze existing methods to understand them and how they relate to one another. By expressing the variance of the Q-targets and the variance of the empirical actor/critic losses of these methods, they analyze the effects of the methods' individual components and compare them. They also offer an explanation of how the choice of data augmentation transformations used when computing the target Q-values may affect these methods, and they derive recommendations for exploiting data augmentation in a more principled way. In addition, the proposed method includes a regularization term called tangent prop, which was previously proposed in computer vision and whose adaptation to DRL is, to the best of the authors' knowledge, novel. In evaluations across several domains, the authors report that their method achieves state-of-the-art performance in most environments and shows higher sample efficiency and better generalization ability in some complex environments.
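To make the tangent prop idea concrete, the following is a minimal sketch of a tangent-prop-style penalty in its general form: the squared directional derivative of a value estimate along the direction in which a transformation moves the input. It is not the authors' implementation. The tiny `tanh` Q-function, the one-pixel horizontal shift, the finite-difference tangent, and all function names are assumptions chosen for illustration.

```python
import numpy as np


def q_value(params, image):
    """Toy Q-function (assumed for illustration): Q(x) = v . tanh(W x)."""
    W, v = params
    hidden = np.tanh(W @ image.ravel())
    return float(v @ hidden)


def shift_tangent(image):
    """Approximate tangent of a horizontal shift by a one-pixel finite difference.

    np.roll wraps pixels around the border, which is a simplification.
    """
    return np.roll(image, 1, axis=1) - image


def tangent_prop_penalty(params, image):
    """Squared directional derivative of Q along the shift tangent."""
    W, v = params
    hidden = np.tanh(W @ image.ravel())
    tangent = shift_tangent(image).ravel()
    # Chain rule: dQ/dx . t = (v * (1 - tanh^2(W x))) . (W t)
    directional_derivative = (v * (1.0 - hidden ** 2)) @ (W @ tangent)
    return float(directional_derivative ** 2)


rng = np.random.default_rng(0)
params = (0.1 * rng.normal(size=(8, 16)), rng.normal(size=8))
image = rng.random((4, 4))
penalty = tangent_prop_penalty(params, image)
```

In a training loop, a term like `penalty` would typically be scaled by a weight and added to the critic loss, so that the learned Q-values change little under small transformations of the observation. How the paper weights and combines this term is not stated in the text above.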

Boosting Unsupervised Contrastive Learning Using Diffusion-Based Data Augmentation from Scratch

A Comprehensive Survey of Data Augmentation in Visual Reinforcement Learning

Don’t Touch What Matters: Task-Aware Lipschitz Data Augmentation for Visual Reinforcement Learning

Learning Better with Less: Effective Augmentation for Sample-Efficient Visual Reinforcement Learning

Towards More Sample Efficiency in Reinforcement Learning with Data Augmentation

Automatic Data Augmentation for Generalization in Deep Reinforcement Learning

A Recipe for Unbounded Data Augmentation in Visual Reinforcement Learning

Understanding when Dynamics-Invariant Data Augmentations Benefit Model-Free Reinforcement Learning Updates

Data Augmentation Revisited: Rethinking the Distribution Gap between Clean and Augmented Data

Automatic Data Augmentation for Generalization in Reinforcement Learning

Image Augmentation Is All You Need: Regularizing Deep Reinforcement Learning from Pixels

Data Augmentation for Continual RL via Adversarial Gradient Episodic Memory

Simple Noisy Environment Augmentation for Reinforcement Learning

Understanding Data Augmentation from a Robustness Perspective

Automatic Data Augmentation by Learning the Deterministic Policy

AdaAugment: A Tuning-Free and Adaptive Approach to Enhance Data Augmentation

Invariant Transform Experience Replay: Data Augmentation for Deep Reinforcement Learning

ACAMDA: Improving Data Efficiency in Reinforcement Learning Through Guided Counterfactual Data Augmentation