Abstract: Safety is a major consideration in controlling complex dynamical systems using reinforcement learning (RL), where a safety certificate can provide provable safety guarantees. A valid safety certificate is an energy function that assigns low energy to safe states, together with a corresponding safe control policy that allows the energy function to always dissipate. The safety certificate and the safe control policy are closely related to each other, and both are challenging to synthesize. Existing learning-based studies therefore treat one of them as prior knowledge in order to learn the other, which limits their applicability to systems with general unknown dynamics. This paper proposes a novel approach that simultaneously synthesizes the energy-function-based safety certificate and learns the safe control policy with constrained reinforcement learning (CRL). We do not rely on prior knowledge of either an available model-based controller or a perfect safety certificate. In particular, we formulate a loss function that optimizes the safety certificate parameters by minimizing the occurrence of energy increases. By adding this optimization procedure as an outer loop to Lagrangian-based CRL, we jointly update the policy and safety certificate parameters and prove that they converge to their respective local optima: the optimal safe policy and a valid safety certificate. We evaluate our algorithms on multiple safety-critical benchmark environments. The results show that the proposed algorithm learns provably safe policies with no constraint violations. The validity, or feasibility, of the synthesized safety certificate is also verified numerically.
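
The certificate loss described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact form of the loss, so everything specific here is an assumption: the dissipation condition is taken to be phi(s') <= max(phi(s) - eta, 0), the margin `eta` and the function names `energy_increase_loss` and `dual_ascent_step` are hypothetical, and the multiplier update is a generic projected dual ascent step rather than the paper's update rule.

```python
import numpy as np

def energy_increase_loss(phi_now, phi_next, eta=0.1):
    """Mean hinge penalty on energy increases over a batch of transitions.

    Assumed dissipation condition: phi(s') <= max(phi(s) - eta, 0).
    phi_now and phi_next hold the energy of each state and of its successor.
    """
    phi_now = np.asarray(phi_now, dtype=float)
    phi_next = np.asarray(phi_next, dtype=float)
    # Energy level the successor state is allowed to reach.
    target = np.maximum(phi_now - eta, 0.0)
    # Positive part only: transitions that dissipate enough contribute zero.
    violation = np.maximum(phi_next - target, 0.0)
    return float(violation.mean())

def dual_ascent_step(lmbda, violation, lr=0.05):
    """Generic Lagrange multiplier update, projected onto lmbda >= 0."""
    return max(0.0, lmbda + lr * violation)
```

In a joint scheme of the kind the abstract outlines, a loss like this would be minimized over the certificate parameters in the outer loop, while the policy and the multiplier are updated by the constrained RL inner loop.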

An Iterative Scheme of Safe Reinforcement Learning for Nonlinear Systems Via Barrier Certificate Generation

Safe DNN-type Controller Synthesis for Nonlinear Systems via Meta Reinforcement Learning

Learning safe neural network controllers with barrier certificates

Optimal Control for Constrained Discrete-Time Nonlinear Systems Based on Safe Reinforcement Learning

Formally Verifying Deep Reinforcement Learning Controllers with Lyapunov Barrier Certificates

Verified Safe Reinforcement Learning for Neural Network Dynamic Models

Safe Reinforcement Learning for Dynamical Systems Using Barrier Certificates

Scalable Synthesis of Verified Controllers in Deep Reinforcement Learning

Hybrid Controller Synthesis for Nonlinear Systems Subject to Reach-Avoid Constraints

Simultaneous Synthesis and Verification of Neural Control Barrier Functions through Branch-and-Bound Verification-in-the-loop Training

End-to-End Safe Reinforcement Learning through Barrier Functions for Safety-Critical Continuous Control Tasks

Formal Synthesis of Neural Barrier Certificates for Continuous Systems Via Counterexample Guided Learning

Model-Free Safe Reinforcement Learning Through Neural Barrier Certificate

Synthesizing Barrier Certificates Using Neural Networks

Synthesizing Barrier Certificates of Neural Network Controlled Continuous Systems Via Approximations

Joint Synthesis of Safety Certificate and Safe Control Policy Using Constrained Reinforcement Learning

Safe Controller for Output Feedback Linear Systems using Model-Based Reinforcement Learning

Safe Reinforcement Learning via a Model-Free Safety Certifier

Sablas: Learning Safe Control for Black-Box Dynamical Systems

Implicit Safe Set Algorithm for Provably Safe Reinforcement Learning