Abstract: Deep Reinforcement Learning (DRL) is an approach for training autonomous agents across various complex environments. Despite its strong performance in well-known environments, it remains susceptible to minor variations in conditions, raising concerns about its reliability in real-world applications. To improve usability, DRL must demonstrate trustworthiness and robustness. One way to improve the robustness of DRL to unknown changes in conditions is Adversarial Training, in which the agent is trained against well-suited adversarial attacks on the dynamics of the environment. Addressing this critical issue, our work presents an in-depth analysis of contemporary adversarial attack methodologies, systematically categorizing them and comparing their objectives and operational mechanisms. This classification offers detailed insight into how adversarial attacks can be used to evaluate the resilience of DRL agents, thereby paving the way for enhancing their robustness.

What problem does this paper attempt to address?

The paper aims to improve the robustness and reliability of deep reinforcement learning (DRL) under changes in environmental conditions. Although DRL performs well in known environments, in practical applications it is very sensitive to minor changes in conditions, which raises concerns about its reliability. To make DRL more robust to unknown changes in conditions, the paper explores adversarial training, that is, training agents under adversarial attacks in order to evaluate and improve their robustness. Specifically, the paper focuses on the following aspects:

1. **Robustness issues**:
   - DRL agents perform well in simulated environments, but their performance may decline when they are transferred to real-world applications. This is the so-called "reality gap".
   - Perturbations in the real world (such as sensor failures or differences in physical characteristics) may cause DRL agents to make wrong decisions or suffer performance degradation.
   - Adversarial attacks generate small, intentionally designed perturbations aimed at misleading a neural network's decisions, thereby revealing the vulnerability of DRL agents.
2. **Adversarial training**:
   - Adversarial training introduces adversarial samples during the training stage, so that agents learn to deal with a range of possible perturbations and are more robust at deployment.
   - The goal of adversarial training is to enable agents to maintain good performance in the face of unknown changes in conditions.
3. **Classification and comparison**:
   - The paper systematically classifies and compares existing adversarial attack methods and divides them into two categories: observation alterations and dynamics alterations.
   - Observation alterations change the observation values received by agents, while dynamics alterations change the dynamic characteristics of the environment, such as the state-transition function.
4. **Contributions**:
   - Formally defines the concept of robustness in DRL.
   - Proposes a new classification system that organizes all types of perturbations into a unified model.
   - Reviews and classifies the adversarial attack methods in the existing literature.
   - Explores how to use adversarial attacks to improve the robustness of DRL agents.

Through these studies, the paper aims to provide a comprehensive framework for understanding and improving the robustness of DRL agents, making them more reliable and trustworthy in real-world applications.

### Formula summary

- **Cumulative reward**:
  \[ R(\tau)=\sum_{t=0}^{|\tau|}\gamma^t R(s_t,a_t,s_{t+1}) \]
- **Optimal policy**:
  \[ \pi^*=\arg\max_{\pi}E_{\tau\sim\pi_{\Omega}}[R(\tau)] \]
- **Value function**:
  \[ V^{\pi}(s)=E_{\tau\sim\pi_{\Omega}}[R(\tau)\mid s_0=s] \]
- **Q-value function**:
  \[ Q^{\pi}(s,a)=E_{\tau\sim\pi_{\Omega}}[R(\tau)\mid s_0=s,a_0=a] \]
- **Adversarial sample generation**:
  \[ \min_{x'}\|x-x'\|\quad\text{s.t.}\quad f_{\theta}(x)\neq f_{\theta}(x') \]
- **Robust optimization problem**:
  \[ \pi^*=\arg\max_{\pi}\min_{\tilde{\Phi}\in R}E_{\phi\sim\tilde{\Phi}(\phi|\pi)}E_{\tau\sim\pi_{\phi,\Omega}}[R(\tau)] \]

These formulas show D
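
To make two of the formulas in the summary concrete, the following is a minimal Python sketch, not the paper's implementation. It computes the cumulative discounted reward of one trajectory and then applies the max-min idea of the robust optimization problem to a toy table. It assumes a finite set of policies and a finite set of fixed perturbations, whereas the formula minimizes over distributions of perturbations. The function names, policy names, perturbation names, and all numbers are hypothetical.

```python
from typing import Dict, Sequence


def discounted_return(rewards: Sequence[float], gamma: float) -> float:
    """Cumulative reward R(tau) = sum_t gamma^t * r_t for one trajectory."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))


def select_robust_policy(returns: Dict[str, Dict[str, float]]) -> str:
    """Return the policy whose worst-case expected return is highest.

    `returns[policy][perturbation]` holds an expected return. The inner
    min plays the role of the adversary, the outer max that of the agent.
    """
    return max(returns, key=lambda policy: min(returns[policy].values()))


# Hypothetical expected returns (illustrative numbers, not from the paper).
toy_returns = {
    "nominal": {"no_perturbation": 10.0, "sensor_noise": 2.0},
    "adversarially_trained": {"no_perturbation": 8.0, "sensor_noise": 7.0},
}

if __name__ == "__main__":
    # 1.0 + 0.5 * 1.0 + 0.25 * 1.0 = 1.75
    print(discounted_return([1.0, 1.0, 1.0], gamma=0.5))
    # Worst cases are 2.0 (nominal) and 7.0 (adversarially_trained).
    print(select_robust_policy(toy_returns))
```

In this toy table the nominal policy has the higher return without perturbation, but the max-min criterion prefers the policy whose worst case is better, which is the trade-off the robust optimization problem formalizes.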

Robust Deep Reinforcement Learning Through Adversarial Attacks and Training : A Survey

Challenges and Countermeasures for Adversarial Attacks on Deep Reinforcement Learning

Robustifying Reinforcement Learning Agents via Action Space Adversarial Training

Robust Deep Reinforcement Learning with Adversarial Attacks

Exploring the Vulnerability of Deep Reinforcement Learning-based Emergency Control for Low Carbon Power Systems

Adversarial robustness of deep reinforcement learning-based intrusion detection

Robust Deep Reinforcement Learning against Adversarial Perturbations on State Observations

Deep Reinforcement Learning for Autonomous Cyber Defence: A Survey

Characterizing Attacks on Deep Reinforcement Learning

Security and Privacy Issues in Deep Reinforcement Learning: Threats and Countermeasures

Robust Reinforcement Learning: A Review of Foundations and Recent Advances

Opportunities and Challenges in Deep Learning Adversarial Robustness: A Survey

Online Robustness Training for Deep Reinforcement Learning

A Survey on Reinforcement Learning Security with Application to Autonomous Driving

Improving Robustness of Reinforcement Learning for Power System Control with Adversarial Training

Deep Reinforcement Learning for Robotics: A Survey of Real-World Successes

Reinforcement Learning-Based Approaches for Enhancing Security and Resilience in Smart Control: A Survey on Attack and Defense Methods

Attacking and Defending Deep Reinforcement Learning Policies

RL-Based Method for Benchmarking the Adversarial Resilience and Robustness of Deep Reinforcement Learning Policies

Transferable Adversarial Attacks on Deep Reinforcement Learning with Domain Randomization

Deep-Attack over the Deep Reinforcement Learning