Abstract: Recent years have witnessed great potential in applying Deep Reinforcement Learning (DRL) to various challenging applications, such as autonomous driving, nuclear fusion control, and complex game playing. However, researchers have recently revealed that deep reinforcement learning models are vulnerable to adversarial attacks: malicious attackers can train adversarial policies to tamper with the observations of a well-trained victim agent, which then fails dramatically when faced with such an attack. Understanding and improving the adversarial robustness of deep reinforcement learning is of great importance in enhancing the quality and reliability of a wide range of DRL-enabled systems. In this paper, we develop curiosity-driven and victim-aware adversarial policy training, a novel method that can more effectively exploit the defects of victim agents. To be victim-aware, we build a surrogate network that approximates the state-value function of a black-box victim in order to collect the victim's information. We then propose a curiosity-driven approach, which encourages an adversarial policy to use the information from the hidden layer of the surrogate network to exploit the vulnerability of victims efficiently. Extensive experiments demonstrate that our proposed method outperforms or matches the current state of the art across multiple environments. We perform an ablation study to emphasize the benefits of utilizing the approximated victim information. Further analysis suggests that our method is harder to defend against with a commonly used defensive strategy, which calls attention to the need for more effective protection of systems that use DRL.
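The two components named in the abstract can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the abstract does not specify the network architecture, the training signal, or the form of the curiosity bonus. The sketch assumes (1) a one-hidden-layer surrogate regressed onto returns observed for the victim, and (2) a novelty bonus in the style of random network distillation computed on the surrogate's hidden activations. All names (SurrogateValueNet, HiddenStateCuriosity, shaped_reward, beta) are hypothetical.

```python
import numpy as np


class SurrogateValueNet:
    """Hypothetical surrogate: a tiny MLP fitted to the victim's observed returns."""

    def __init__(self, obs_dim, hidden_dim=16, lr=0.02, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0.0, 0.5, (obs_dim, hidden_dim))
        self.b1 = np.zeros(hidden_dim)
        self.w2 = rng.normal(0.0, 0.5, hidden_dim)
        self.b2 = 0.0
        self.lr = lr

    def hidden(self, obs):
        # Hidden-layer activations, used here as the "victim information".
        return np.tanh(obs @ self.W1 + self.b1)

    def value(self, obs):
        return self.hidden(obs) @ self.w2 + self.b2

    def fit_step(self, obs, returns):
        """One gradient step on 0.5 * mean squared error; returns the MSE before the step."""
        h = self.hidden(obs)
        err = h @ self.w2 + self.b2 - returns
        n = len(returns)
        # Backpropagate through the tanh layer.
        dh = np.outer(err, self.w2) * (1.0 - h ** 2)
        self.w2 -= self.lr * (h.T @ err / n)
        self.b2 -= self.lr * err.mean()
        self.W1 -= self.lr * (obs.T @ dh / n)
        self.b1 -= self.lr * dh.mean(axis=0)
        return float(np.mean(err ** 2))


class HiddenStateCuriosity:
    """Assumed RND-style bonus: prediction error against a fixed random projection."""

    def __init__(self, hidden_dim, feat_dim=8, lr=0.05, seed=1):
        rng = np.random.default_rng(seed)
        self.target = rng.normal(0.0, 1.0, (hidden_dim, feat_dim))  # never trained
        self.pred = np.zeros((hidden_dim, feat_dim))                # trained predictor
        self.lr = lr

    def bonus(self, h):
        # Large for hidden states the predictor has rarely seen.
        diff = h @ self.pred - h @ self.target
        return np.sum(diff ** 2, axis=-1)

    def update(self, h):
        # Gradient step that shrinks the bonus on the states just visited.
        diff = h @ self.pred - h @ self.target
        self.pred -= self.lr * (h.T @ diff / len(h))


def shaped_reward(extrinsic, intrinsic, beta=0.1):
    """Adversary's training reward: task reward plus a weighted curiosity bonus."""
    return extrinsic + beta * intrinsic
```

In this sketch the adversary would be trained with any standard policy-gradient method on shaped_reward, with fit_step and update called on each batch of collected trajectories. The weight beta is an illustrative hyperparameter, not a value taken from the paper.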

Understanding Adversarial Attacks on Observations in Deep Reinforcement Learning

Optimal Attack and Defense for Reinforcement Learning

Robust Multi-Agent Reinforcement Learning against Adversaries on Observation

Deep-Attack over the Deep Reinforcement Learning

Adversarial Policies: Attacking Deep Reinforcement Learning

Optimal Attacks on Reinforcement Learning Policies

Curiosity-Driven and Victim-Aware Adversarial Policies

Efficient Reward Poisoning Attacks on Online Deep Reinforcement Learning

Attacking and Defending Deep Reinforcement Learning Policies

Characterizing Attacks on Deep Reinforcement Learning

Robust Reinforcement Learning on State Observations with Learned Optimal Adversary

Strategically-timed State-Observation Attacks on Deep Reinforcement Learning Agents

Selective Real‐time Adversarial Perturbations Against Deep Reinforcement Learning Agents

Minimalistic Attacks: How Little It Takes to Fool Deep Reinforcement Learning Policies

Robust Deep Reinforcement Learning against Adversarial Perturbations on State Observations

Adversarial Cheap Talk

Enhanced adversarial strategically-timed attacks against deep reinforcement learning

Multiple-Model Based Defense for Deep Reinforcement Learning Against Adversarial Attack

Adversarial Attacks on Multiagent Deep Reinforcement Learning Models in Continuous Action Space

Adversarial Attacks on Reinforcement Learning Agents for Command and Control

Robust Deep Reinforcement Learning with Adversarial Attacks