Abstract: Safety is a major consideration in controlling complex dynamical systems using reinforcement learning (RL), where a safety certificate can provide provable safety guarantees. A valid safety certificate is an energy function that assigns low energy to safe states, and for which there exists a corresponding safe control policy that allows the energy function to always dissipate. The safety certificate and the safe control policy are closely related to each other, and both are challenging to synthesize. Existing learning-based studies therefore treat one of them as prior knowledge in order to learn the other, which limits their applicability to systems with general unknown dynamics. This paper proposes a novel approach that simultaneously synthesizes the energy-function-based safety certificate and learns the safe control policy with constrained reinforcement learning (CRL). We do not rely on prior knowledge of either an available model-based controller or a perfect safety certificate. In particular, we formulate a loss function that optimizes the safety certificate parameters by minimizing the occurrence of energy increases. By adding this optimization procedure as an outer loop to Lagrangian-based CRL, we jointly update the policy and safety certificate parameters and prove that they converge to their respective local optima, namely the optimal safe policy and a valid safety certificate. We evaluate our algorithm on multiple safety-critical benchmark environments. The results show that the proposed algorithm learns provably safe policies with no constraint violations. The validity, or feasibility, of the synthesized safety certificate is also verified numerically.
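The two ingredients named in the abstract, a certificate loss that penalizes energy increases and a Lagrangian multiplier update, can be illustrated with a minimal sketch. This is not the paper's implementation: the hinge form of the loss, the function names `energy_increase_loss` and `dual_ascent_step`, the step size, and the sample energy values are all assumptions made for illustration.

```python
import numpy as np


def energy_increase_loss(phi_now, phi_next):
    """Average amount by which the energy rises over sampled transitions.

    Assumed hinge form: only transitions whose energy increases contribute.
    """
    phi_now = np.asarray(phi_now, dtype=float)
    phi_next = np.asarray(phi_next, dtype=float)
    increase = np.maximum(phi_next - phi_now, 0.0)
    return float(increase.mean())


def dual_ascent_step(lam, violation, lr=0.1):
    """Projected gradient ascent on a Lagrange multiplier, kept non-negative."""
    return max(lam + lr * violation, 0.0)


# Hypothetical energy values before and after three sampled transitions.
phi_now = np.array([1.0, 0.5, -0.2])
phi_next = np.array([0.8, 0.9, -0.3])

loss = energy_increase_loss(phi_now, phi_next)  # only the second transition rises
lam = dual_ascent_step(0.0, loss)               # multiplier grows while energy rises
```

In a full training loop, the certificate parameters would be updated to reduce this loss while the policy is updated against the multiplier-weighted constraint; both updates are omitted here because the abstract does not specify them.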

Safe Transfer-Reinforcement-Learning-Based Optimal Control of Nonlinear Systems

Optimal Control for Constrained Discrete-Time Nonlinear Systems Based on Safe Reinforcement Learning

Safety reinforcement learning control via transfer learning

Accelerating Reinforcement Learning with Local Data Enhancement for Process Control

Train Trajectory Optimization with High-Risk State Space Boundaries: A Safe Reinforcement Learning Approach

Robust Safe Reinforcement Learning Control of Unknown Continuous-Time Nonlinear Systems with State Constraints and Disturbances

Safe Reinforcement Learning Using Robust Control Barrier Functions

Control invariant set enhanced reinforcement learning for process control: improved sampling efficiency and guaranteed stability

Cautious Adaptation For Reinforcement Learning in Safety-Critical Settings

Model-Based Safe Reinforcement Learning with Time-Varying State and Control Constraints: An Application to Intelligent Vehicles

Control invariant set enhanced safe reinforcement learning: improved sampling efficiency, guaranteed stability and robustness

Machine learning model‐based optimal tracking control of nonlinear affine systems with safety constraints

Model-Based Safe Reinforcement Learning With Time-Varying Constraints: Applications to Intelligent Vehicles

End-to-End Safe Reinforcement Learning through Barrier Functions for Safety-Critical Continuous Control Tasks

Adaptive Aggregation for Safety-Critical Control

Joint Synthesis of Safety Certificate and Safe Control Policy Using Constrained Reinforcement Learning

Reinforcement Learning Control of Constrained Dynamic Systems with Uniformly Ultimate Boundedness Stability Guarantee

Reinforcement Learning with Adaptive Regularization for Safe Control of Critical Systems

Safety-Enhanced Self-Learning for Optimal Power Converter Control

Learning to be Safe: Deep RL with a Safety Critic

Reinforcement Learning-Based Optimal Fault-Tolerant Tracking Control of Industrial Processes