Abstract: Safety is critical when applying reinforcement learning (RL) to real-world problems. As a result, safe RL has emerged as a fundamental and powerful paradigm for optimizing an agent's policy while incorporating notions of safety. A prevalent safe RL approach is based on a constrained criterion, which seeks to maximize the expected cumulative reward subject to specific safety constraints. Despite recent efforts to enhance safety in RL, a systematic understanding of the field remains difficult. This challenge stems from the diversity of constraint representations and the limited exploration of their interrelations. To bridge this knowledge gap, we present a comprehensive review of representative constraint formulations, along with a curated selection of algorithms designed specifically for each formulation. In addition, we elucidate the theoretical underpinnings that reveal the mathematical mutual relations among common problem formulations. We conclude with a discussion of the current state and future directions of safe reinforcement learning research.

What problem does this paper attempt to address?

The paper addresses how to systematically understand and apply the different constraint representations used in reinforcement learning (RL) to achieve safe reinforcement learning (safe RL). Specifically, it focuses on the constraint formulations used in constraint-based safe RL and on the algorithms designed for them. Despite considerable recent effort to enhance the safety of RL, a systematic understanding of the area remains difficult, mainly because constraint representations are diverse and the relationships among them have received little study. To fill this gap, the paper gives a comprehensive review of representative constraint formulations and selects algorithms designed specifically for each one. It also lays out the theoretical basis for the mathematical relationships between common problem formulations. Finally, it discusses the current state and future directions of safe RL research. The main contributions of the paper are as follows:
1. Provide a comprehensive review of constraint formulations in safe reinforcement learning.
2. Introduce representative algorithms for each constraint formulation.
3. Discuss the relationships between the constraint formulations by defining three theoretical concepts: transformability, generalizability, and conservative approximation.
4. Propose two problems, the identical or more general safe reinforcement learning (IoMG-SafeRL) problems, into which other common problems can be transformed or by which they can be conservatively approximated.
Through these contributions, the paper aims to bridge the gap between the various safe RL problems and the algorithms suited to them, and to lay the foundation for a systematic understanding of the field by organizing existing research around constraint representation.
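
For readers who want the constrained criterion from the abstract in symbols, one common way to write it is sketched below. The notation is an assumption made here for illustration and is not taken from the paper: \pi is the policy, r the reward function, c a safety cost function, \gamma \in [0, 1) a discount factor, and \xi a safety threshold. This is the expected-cumulative-cost form only; the other formulations the survey reviews change the form of the constraint.

```latex
% Illustrative notation (assumed, not the paper's):
%   \pi = policy, r = reward, c = safety cost,
%   \gamma = discount factor, \xi = safety threshold
\begin{aligned}
\max_{\pi}\quad
  & \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t)\right] \\
\text{subject to}\quad
  & \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, c(s_t, a_t)\right] \le \xi
\end{aligned}
```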

A Survey of Constraint Formulations in Safe Reinforcement Learning

State-wise Safe Reinforcement Learning: A Survey

A Review of Safe Reinforcement Learning: Methods, Theory and Applications

Safe Distributional Reinforcement Learning

Safe Reinforcement Learning with Learned Non-Markovian Safety Constraints

Safe Exploration in Reinforcement Learning: A Generalized Formulation and Algorithms

A Review of Safe Reinforcement Learning: Methods, Theories, and Applications

Reachability Constrained Reinforcement Learning

Probabilistic Constraint for Safety-Critical Reinforcement Learning

Evaluating Model-free Reinforcement Learning Toward Safety-critical Tasks

Safe Reinforcement Learning via Hierarchical Adaptive Chance-Constraint Safeguards

Safe Reinforcement Learning with Free-form Natural Language Constraints and Pre-Trained Language Models

Constrained reinforcement learning with statewise projection: a control barrier function approach

Safe Model-Based Reinforcement Learning with an Uncertainty-Aware Reachability Certificate

Feasibility Consistent Representation Learning for Safe Reinforcement Learning

Resilient Constrained Reinforcement Learning

Enforcing Hard Constraints with Soft Barriers: Safe Reinforcement Learning in Unknown Stochastic Environments

Concurrent Learning of Policy and Unknown Safety Constraints in Reinforcement Learning

Safe Reinforcement Learning on the Constraint Manifold: Theory and Applications

Joint Synthesis of Safety Certificate and Safe Control Policy Using Constrained Reinforcement Learning

Model-Free Safe Reinforcement Learning Through Neural Barrier Certificate