Abstract: Reinforcement learning (RL) agents are vulnerable to adversarial disturbances, which can degrade task performance or violate safety specifications. Existing methods either address safety requirements under the assumption of no adversary (e.g., safe RL) or focus only on robustness against performance adversaries (e.g., robust RL). Learning one policy that is both safe and robust under any adversary remains a challenging open problem. The difficulty lies in tackling two intertwined aspects in the worst case: feasibility and optimality. Optimality is only valid inside a feasible region (i.e., a robust invariant set), while identifying the maximal feasible region in turn relies on learning the optimal policy. To address this issue, we propose a systematic framework to unify safe RL and robust RL, including the problem formulation, iteration scheme, convergence analysis and practical algorithm design. The unification is built upon constrained two-player zero-sum Markov games, in which the objective for the protagonist is twofold. For states inside the maximal robust invariant set, the goal is to pursue rewards under the condition of guaranteed safety; for states outside the maximal robust invariant set, the goal is to reduce the extent of constraint violation. A dual policy iteration scheme is proposed, which simultaneously optimizes a task policy and a safety policy. We prove that the iteration scheme converges to the optimal task policy, which maximizes the twofold objective in the worst case, and to the optimal safety policy, which stays as far away from the safety boundary as possible. The convergence of the safety policy is established by exploiting the monotone contraction property of safety self-consistency operators, and that of the task policy depends on the transformation of safety constraints into state-dependent action spaces.
By adding two adversarial networks (one for the safety guarantee and the other for task performance), we propose a practical deep RL algorithm for constrained zero-sum Markov games, called dually robust actor-critic (DRAC). Evaluations on safety-critical benchmarks demonstrate that DRAC achieves high performance and persistent safety under all scenarios (no adversary, safety adversary, performance adversary), outperforming all baselines by a large margin.
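To make the notions of a maximal robust invariant set and state-dependent action spaces concrete, the following is a minimal tabular sketch on a toy problem. It is an illustration under stated assumptions, not the DRAC algorithm: the 7-state chain, the "slippery" states where the protagonist has no control authority, the constraint function `H` (positive means violation), and the reachability-style backup `V(s) = max(h(s), min_a max_d V(s'))` are all hypothetical choices made for this example, and the paper's safety self-consistency operator may differ in form.

```python
# Toy two-player zero-sum game on a chain (all names and numbers are assumptions).
N = 7                                # states 0..6
H = [1 - 2 * s for s in range(N)]    # constraint h(s); h(s) > 0 means violation (only s = 0)
SLIPPERY = {1, 2}                    # protagonist's action has no effect in these states
ACTIONS = (-1, 0, 1)                 # protagonist moves
DISTURBANCES = (-1, 0)               # adversary pushes toward the unsafe state


def next_state(s, a, d):
    """Deterministic transition, clipped to the chain."""
    effective_a = 0 if s in SLIPPERY else a
    return min(max(s + effective_a + d, 0), N - 1)


def worst_case(v, s, a):
    """Value of taking action a in state s against the worst disturbance."""
    return max(v[next_state(s, a, d)] for d in DISTURBANCES)


def safety_values():
    """Iterate V <- max(h, min_a max_d V(s')) from V = h until a fixed point.

    Starting from h, the iterates are monotonically non-decreasing and take
    values in the finite set of h values, so the loop terminates.
    """
    v = list(H)
    while True:
        new_v = [max(H[s], min(worst_case(v, s, a) for a in ACTIONS))
                 for s in range(N)]
        if new_v == v:
            return v
        v = new_v


def safe_actions(v, s):
    """State-dependent action space: actions that keep the worst case feasible."""
    return [a for a in ACTIONS if worst_case(v, s, a) <= 0]


V = safety_values()
feasible = [s for s in range(N) if V[s] <= 0]
constraint_ok = [s for s in range(N) if H[s] <= 0]

print(V)                    # [1, 1, 1, -5, -7, -9, -11]
print(constraint_ok)        # [1, 2, 3, 4, 5, 6]
print(feasible)             # [3, 4, 5, 6]
print(safe_actions(V, 3))   # [1]
print(safe_actions(V, 4))   # [0, 1]
```

In this toy setting, states 1 and 2 satisfy the constraint but lie outside the robust invariant set, because the adversary can force them into state 0. Inside the set, a task policy would maximize reward over `safe_actions` only; outside it, the natural choice is the action that minimizes `worst_case`, which corresponds to reducing the extent of constraint violation.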

RLUC: Strengthening Robustness by Attaching Constraint Considerations to Policy Network

Beyond Worst-case Attacks: Robust RL with Adaptive Defense via Non-dominated Policies

Robust Deep Reinforcement Learning against Adversarial Perturbations on State Observations

Toward Evaluating Robustness of Reinforcement Learning with Adversarial Policy

RL-Based Method for Benchmarking the Adversarial Resilience and Robustness of Deep Reinforcement Learning Policies

On the Robustness of Safe Reinforcement Learning under Observational Perturbations

On the Perturbed States for Transformed Input-robust Reinforcement Learning

Adversarial Policies: Attacking Deep Reinforcement Learning

Towards Robust Policy: Enhancing Offline Reinforcement Learning with Adversarial Attacks and Defenses

Robust off-policy Reinforcement Learning via Soft Constrained Adversary

Transferable Adversarial Attacks on Deep Reinforcement Learning with Domain Randomization

Belief-Enriched Pessimistic Q-Learning against Adversarial State Perturbations

Robust Multi-Agent Reinforcement Learning against Adversaries on Observation

Robust Deep Reinforcement Learning with Adversarial Attacks

Robust Multi-Agent Reinforcement Learning via Adversarial Regularization: Theoretical Foundation and Stable Algorithms

Robust Deep Reinforcement Learning with Adaptive Adversarial Perturbations in Action Space

Adversary Agnostic Robust Deep Reinforcement Learning

Robustifying Reinforcement Learning Agents via Action Space Adversarial Training

Efficient Adversarial Training without Attacking: Worst-Case-Aware Robust Reinforcement Learning

Safe Reinforcement Learning with Dual Robustness

Probabilistic Perspectives on Error Minimization in Adversarial Reinforcement Learning