A Survey of Constraint Formulations in Safe Reinforcement Learning

Akifumi Wachi, Xun Shen, Yanan Sui
2024-05-08
Abstract: Safety is critical when applying reinforcement learning (RL) to real-world problems. As a result, safe RL has emerged as a fundamental and powerful paradigm for optimizing an agent's policy while incorporating notions of safety. A prevalent safe RL approach is based on a constrained criterion, which seeks to maximize the expected cumulative reward subject to specific safety constraints. Despite recent effort to enhance safety in RL, a systematic understanding of the field remains difficult. This challenge stems from the diversity of constraint representations and little exploration of their interrelations. To bridge this knowledge gap, we present a comprehensive review of representative constraint formulations, along with a curated selection of algorithms designed specifically for each formulation. In addition, we elucidate the theoretical underpinnings that reveal the mathematical mutual relations among common problem formulations. We conclude with a discussion of the current state and future directions of safe reinforcement learning research.
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses how to systematically understand and apply the different constraint formulations used in safe reinforcement learning (safe RL). Specifically, it focuses on constraint-based safe RL: the forms that constraints take and the algorithms designed for each form. Although much recent work has aimed to make RL safer, a systematic understanding of the area remains difficult, mainly because constraint representations are diverse and their interrelationships have been little studied. To fill this gap, the paper reviews representative constraint formulations and selects algorithms designed specifically for each one. It also sets out the theoretical basis for the mathematical relationships among common problem formulations, and closes with a discussion of the current state and future directions of safe RL research.
The main contributions of the paper are as follows:
1. Provide a comprehensive review of constraint formulations in safe reinforcement learning.
2. Introduce representative algorithms for each constraint formulation.
3. Discuss the relationships among constraint formulations by defining three theoretical concepts: transformability, generalizability, and conservative approximation.
4. Propose two problems, termed identical or more general safe reinforcement learning (IoMG-SafeRL) problems, into which other common problems can be transformed or by which they can be conservatively approximated.
Through these contributions, the paper aims to bridge the gap between safe reinforcement learning problems and the algorithms suited to them. By organizing existing research around constraint representation, it lays a foundation for a systematic understanding of the field.
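For orientation, the constrained criterion mentioned in the abstract (maximize expected cumulative reward subject to safety constraints) is often written in a form like the one below. This is a minimal sketch of one common variant, an expected cumulative cost constraint, and not necessarily the notation used in the paper. All symbols are assumptions introduced here for illustration: \pi is the policy, r the reward function, c a safety cost function, \gamma a discount factor, and \xi a safety threshold.

```latex
% Illustrative only: a constrained criterion with an expected cumulative
% cost constraint. The symbols (\pi, r, c, \gamma, \xi) are assumed here
% and are not taken from the summary above.
\max_{\pi} \; \mathbb{E}_{\pi}\!\left[ \sum_{t=0}^{\infty} \gamma^{t} \, r(s_t, a_t) \right]
\quad \text{subject to} \quad
\mathbb{E}_{\pi}\!\left[ \sum_{t=0}^{\infty} \gamma^{t} \, c(s_t, a_t) \right] \le \xi
```

Other constraint formulations change the second expression, for example by replacing the expectation with a probability or by requiring the bound to hold at every time step, while keeping the same reward objective.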