Abstract: Ensuring the safety of nonlinear systems under model uncertainty and external disturbances is crucial, especially for real-world control tasks. Predictive methods such as robust model predictive control (RMPC) require solving nonconvex optimization problems online, which leads to a high computational burden and poor scalability. Reinforcement learning (RL) works well with complex systems, but at the price of losing rigorous safety guarantees. This paper presents a theoretical framework that combines the advantages of RMPC and RL to synthesize safety filters for nonlinear systems with state- and action-dependent uncertainty. We decompose the robust invariant set (RIS) into two parts: a target set that aligns with the terminal region design of RMPC, and a reach-avoid set that accounts for the rest of the RIS. We propose a policy iteration approach for robust reach-avoid problems and establish its monotone convergence. This method sets the stage for an adversarial actor-critic deep RL algorithm, which simultaneously synthesizes a reach-avoid policy network, a disturbance policy network, and a reach-avoid value network. The learned reach-avoid policy network is used to generate nominal trajectories for online verification, which filters out actions that may drive the system into unsafe regions when worst-case disturbances are applied. We formulate a second-order cone programming (SOCP) approach for online verification using system level synthesis, which optimizes the worst-case reach-avoid value over all possible trajectories. The proposed safety filter has much lower computational complexity than RMPC and still enjoys a persistent robust safety guarantee. The effectiveness of our method is illustrated through a numerical example.
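The filtering logic described in the abstract (accept a proposed action only if a trajectory generated by a fallback policy can be verified against worst-case disturbances, otherwise fall back) can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes a hypothetical scalar system x_next = x + u + w with |w| <= W_MAX, replaces the learned reach-avoid policy network with a hand-written saturated controller, and replaces the SOCP and system level synthesis verification with plain interval propagation. All names and constants below are illustrative assumptions.

```python
# Toy predictive safety filter on an assumed scalar system x_next = x + u + w.
# Everything here is a hypothetical stand-in, not the method from the paper.

W_MAX = 0.05     # assumed disturbance bound, |w| <= W_MAX
U_MAX = 0.5      # assumed actuator limit, |u| <= U_MAX
X_SAFE = 1.0     # assumed state constraint, |x| <= X_SAFE
X_TARGET = 0.2   # assumed target set, |x| <= X_TARGET
HORIZON = 10     # verification horizon (number of fallback steps)


def clip(value: float, bound: float) -> float:
    """Saturate value to the interval [-bound, bound]."""
    return max(-bound, min(bound, value))


def backup_policy(x: float) -> float:
    """Hand-written stand-in for a learned reach-avoid policy: steer to 0."""
    return clip(-x, U_MAX)


def nominal_next(x: float) -> float:
    """Disturbance-free closed-loop step under the backup policy."""
    return x + backup_policy(x)


def verify(x: float, u: float) -> bool:
    """Check that every disturbance sequence keeps the state safe and
    reaches the target set within HORIZON fallback steps after applying u.

    nominal_next is non-decreasing in x, so propagating the two interval
    endpoints gives the exact reachable interval for this toy system.
    """
    lo, hi = x + u - W_MAX, x + u + W_MAX
    for _ in range(HORIZON):
        if lo < -X_SAFE or hi > X_SAFE:
            return False  # some disturbance leads to a constraint violation
        if lo >= -X_TARGET and hi <= X_TARGET:
            return True   # whole interval is inside the target set
        lo, hi = nominal_next(lo) - W_MAX, nominal_next(hi) + W_MAX
    return lo >= -X_TARGET and hi <= X_TARGET


def safety_filter(x: float, u_proposed: float) -> float:
    """Return the proposed action if it is verified, else the backup action."""
    u = clip(u_proposed, U_MAX)
    return u if verify(x, u) else backup_policy(x)
```

In this toy setting, `safety_filter(0.0, 0.3)` passes the proposed action through, while `safety_filter(0.8, 0.3)` rejects it because the worst-case successor leaves the safe set, and the backup action is applied instead. The sketch does not by itself establish the persistent safety guarantee claimed in the abstract; that relies on the paper's decomposition of the RIS and its verification procedure.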

Safe Online Integral Reinforcement Learning for Control Systems Via Controller Decomposition

Optimal Control for Constrained Discrete-Time Nonlinear Systems Based on Safe Reinforcement Learning

Sample-efficient Safe Learning for Online Nonlinear Control with Control Barrier Functions

End-to-End Safe Reinforcement Learning through Barrier Functions for Safety-Critical Continuous Control Tasks

Safe Controller for Output Feedback Linear Systems using Model-Based Reinforcement Learning

Safe Transfer-Reinforcement-Learning-Based Optimal Control of Nonlinear Systems

Learning Predictive Safety Filter via Decomposition of Robust Invariant Set

Model-Based Safe Reinforcement Learning with Time-Varying State and Control Constraints: An Application to Intelligent Vehicles

Safety-Enhanced Self-Learning for Optimal Power Converter Control

Lagrangian-based online safe reinforcement learning for state-constrained systems

Control invariant set enhanced safe reinforcement learning: improved sampling efficiency, guaranteed stability and robustness

Train Trajectory Optimization with High-Risk State Space Boundaries: A Safe Reinforcement Learning Approach

Safe adaptive output‐feedback optimal control of a class of linear systems

Implicit Safe Set Algorithm for Provably Safe Reinforcement Learning

Safe Inverse Reinforcement Learning via Control Barrier Function

Look Before You Leap: Safe Model-Based Reinforcement Learning with Human Intervention

Specialized Deep Residual Policy Safe Reinforcement Learning-Based Controller for Complex and Continuous State-Action Spaces

Safe Reinforcement Learning Using Robust Control Barrier Functions

Robust Safe Reinforcement Learning Control of Unknown Continuous-Time Nonlinear Systems with State Constraints and Disturbances

Control invariant set enhanced reinforcement learning for process control: improved sampling efficiency and guaranteed stability

State and Input Constrained Output-Feedback Adaptive Optimal Control of Affine Nonlinear Systems