Abstract: In this paper, the issue of model uncertainty in safety-critical control is addressed with a data-driven approach. For this purpose, we utilize the structure of an input-output linearization controller based on a nominal model along with a Control Barrier Function and Control Lyapunov Function based Quadratic Program (CBF-CLF-QP). Specifically, we propose a novel reinforcement learning framework which learns the model uncertainty present in the CBF and CLF constraints, as well as other control-affine dynamic constraints in the quadratic program. The trained policy is combined with the nominal model-based CBF-CLF-QP, resulting in the Reinforcement Learning-based CBF-CLF-QP (RL-CBF-CLF-QP), which addresses the problem of model uncertainty in the safety constraints. The performance of the proposed method is validated by testing it on an underactuated nonlinear bipedal robot walking on randomly spaced stepping stones with one step preview, obtaining stable and safe walking under model uncertainty.
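
For orientation, a CBF-CLF-QP of the kind the abstract refers to is often written in the generic form below. This is a sketch under stated assumptions, not necessarily the paper's exact formulation: the symbols $V$ (CLF), $h$ (CBF), the rates $\lambda$ and $\gamma$, the slack variable $\delta$ with weight $p$, and the residual terms $\Delta_V$ and $\Delta_h$ are notation introduced here for illustration.

$$
\begin{aligned}
\min_{u,\,\delta}\quad & u^\top u + p\,\delta^2 \\
\text{s.t.}\quad & L_f V(x) + L_g V(x)\,u + \Delta_V(x,u) + \lambda V(x) \le \delta, \\
& L_f h(x) + L_g h(x)\,u + \Delta_h(x,u) + \gamma\, h(x) \ge 0 .
\end{aligned}
$$

Here $L_f$ and $L_g$ denote Lie derivatives along the nominal dynamics $\dot{x} = f(x) + g(x)u$, and $\Delta_V$, $\Delta_h$ stand for the mismatch between the nominal and true time derivatives of $V$ and $h$. Setting both residuals to zero recovers the nominal model-based program. The problem remains a quadratic program only if the residuals are affine in $u$, which is consistent with the abstract's description of the constraints as control-affine.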

What problem does this paper attempt to address?

This paper addresses the problem of model uncertainty in safety-critical control. Specifically, it proposes a new framework based on Reinforcement Learning (RL) that learns and compensates for model uncertainty in the Control Lyapunov Function (CLF) and Control Barrier Function (CBF) constraints, as well as in other control-affine dynamic constraints of the Quadratic Program (QP). The framework aims to combine the advantages of data-driven methods with the stability and safety guarantees of classical model-based control, so that safety-critical control problems can be handled in dynamic systems with substantial uncertainty.

### The main contributions of the paper include:

1. **Proposing a new RL framework**: The framework learns the model uncertainty in the CLF, the CBF, and other control-affine dynamic constraints simultaneously, in a single learning process.
2. **Expanding the scope of application of the method**: The method applies to outputs and control barrier functions of high relative degree.
3. **Learning the uncertainty of parameterized CBFs**: The learned uncertainty can depend not only on the state but also on other parameters.
4. **Numerical verification**: The effectiveness of the method is verified on an underactuated nonlinear hybrid system with significant model uncertainty (a bipedal robot walking on randomly spaced stepping stones).

### The structure of the paper:

- **Introduction**: Introduces the research background and motivation, emphasizing the importance of combining learning methods with classical control theory.
- **Background knowledge**: Explains the basic concepts of input-output linearization, CLF-based quadratic programs, and CBF-based quadratic programs.
- **Method**: Describes step by step how model uncertainty in the CLF and CBF is learned through RL, and proposes the RL-CBF-CLF-QP framework.
- **Experimental setup**: Describes the simulation setup on the bipedal robot, including two different simulation scenarios.
- **Results**: Presents the experimental results under different model uncertainty conditions, verifying the effectiveness and robustness of the proposed method.

### Key technical details:

- **Input-output linearization**: A feedback control law based on the nominal model linearizes the input-output dynamics of the system.
- **CLF and CBF**: Used to ensure the stability and the safety of the system, respectively.
- **RL framework**: Uses the Deep Deterministic Policy Gradient (DDPG) algorithm to train RL agents that learn the model uncertainty.
- **Quadratic programming**: Incorporates the learned uncertainty into the optimization problem so that the control input satisfies the safety and stability constraints.

### Experimental results:

- **Walking on flat ground**: Under model uncertainty, the proposed RL method maintains stable walking of the robot and satisfies the friction constraints.
- **Walking on stepping stones**: When walking on randomly spaced stepping stones, the RL-CBF-CLF-QP method places the robot's feet safely and adapts to additional uncertainty (such as an increased load).

In conclusion, by combining RL with classical control theory, the paper proposes a method for handling model uncertainty in safety-critical control systems, and validates it in simulation on a bipedal robot.
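
To make the idea of correcting CLF and CBF constraints with learned terms concrete, here is a minimal Python sketch for a toy system with a single scalar input. It is an illustration under assumptions, not the paper's implementation: the class `AffineConstraint`, the function `solve_scalar_cbf_clf_qp`, the toy dynamics, and all numeric values are hypothetical, and the residuals are supplied by hand rather than by a trained policy. For a scalar input the QP has a closed-form solution, so no solver library is needed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AffineConstraint:
    """Scalar-input control-affine expression a + b * u (hypothetical helper)."""
    a: float  # drift part, e.g. L_f V + lambda * V  or  L_f h + gamma * h
    b: float  # input part, e.g. L_g V  or  L_g h

    def corrected(self, delta_a: float = 0.0, delta_b: float = 0.0) -> "AffineConstraint":
        """Add residual terms (here hand-picked; in an RL setting, policy outputs)."""
        return AffineConstraint(self.a + delta_a, self.b + delta_b)


def solve_scalar_cbf_clf_qp(clf: AffineConstraint, cbf: AffineConstraint,
                            slack_weight: float = 100.0) -> float:
    """Solve  min_{u, d} u^2 + p * d^2
       s.t.   clf.a + clf.b * u <= d   (soft stability constraint)
              cbf.a + cbf.b * u >= 0   (hard safety constraint)
    in closed form for a scalar input u."""
    p = slack_weight
    # Eliminating the slack gives the convex cost u^2 + p * max(0, a + b*u)^2.
    if clf.a > 0.0:
        u = -p * clf.a * clf.b / (1.0 + p * clf.b ** 2)
    else:
        u = 0.0  # CLF condition already holds at u = 0
    # A convex scalar cost is minimized over a half-line by clipping.
    if cbf.b > 0.0:
        u = max(u, -cbf.a / cbf.b)
    elif cbf.b < 0.0:
        u = min(u, -cbf.a / cbf.b)
    elif cbf.a < 0.0:
        raise ValueError("CBF constraint is infeasible for every input")
    return u


# Toy example (assumed): nominal model x' = u, goal x = 2, safe set x <= 1.
x, x_goal, x_max = 0.9, 2.0, 1.0
lam, gamma = 1.0, 1.0
V = 0.5 * (x - x_goal) ** 2          # CLF candidate
h = x_max - x                        # CBF candidate
clf_nominal = AffineConstraint(a=lam * V, b=(x - x_goal))  # L_f V = 0
cbf_nominal = AffineConstraint(a=gamma * h, b=-1.0)        # L_f h = 0

u_nominal = solve_scalar_cbf_clf_qp(clf_nominal, cbf_nominal)

# Suppose the true system has an unmodeled drift, x' = u + 0.05.
drift = 0.05
u_corrected = solve_scalar_cbf_clf_qp(
    clf_nominal.corrected(delta_a=(x - x_goal) * drift),
    cbf_nominal.corrected(delta_a=-drift),
)
print(u_nominal, u_corrected)
```

In this toy case the corrected program commands a smaller input than the nominal one, because the unmodeled drift already pushes the state toward the boundary of the safe set. In a learning-based setup the residuals passed to `corrected` would come from a trained policy; the paper's summary names DDPG as the training algorithm, but the interface shown here is only an assumption.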

Reinforcement Learning for Safety-Critical Control under Model Uncertainty, using Control Lyapunov Functions and Control Barrier Functions

End-to-End Safe Reinforcement Learning through Barrier Functions for Safety-Critical Continuous Control Tasks

Learning for Safety-Critical Control with Control Barrier Functions

Reinforcement Learning for Safe Robot Control using Control Lyapunov Barrier Functions

Safe Model-Based Reinforcement Learning for Systems with Parametric Uncertainties

Robust Safe Learning and Control in An Unknown Environment: An Uncertainty-Separated Control Barrier Function Approach

Disturbance Observer-based Control Barrier Functions with Residual Model Learning for Safe Reinforcement Learning

Recursively Feasible Probabilistic Safe Online Learning with Control Barrier Functions

Safe Controller for Output Feedback Linear Systems using Model-Based Reinforcement Learning

Robust Safety-Critical Control for Dynamic Robotics

Reinforcement Learning-Enhanced Control Barrier Functions for Robot Manipulators

Learning-based Model Predictive Control for Safe Exploration and Reinforcement Learning

Safety-Aware Learning-Based Control of Systems with Uncertainty Dependent Constraints (extended version)

Safety-Aware Preference-Based Learning for Safety-Critical Control

Safe Nonlinear Control Using Robust Neural Lyapunov-Barrier Functions

Safe Online Dynamics Learning with Initially Unknown Models and Infeasible Safety Certificates

Learning Piecewise Residuals of Control Barrier Functions for Safety of Switching Systems using Multi-Output Gaussian Processes

Lyapunov-based uncertainty-aware safe reinforcement learning

Model-Free Safe Reinforcement Learning Through Neural Barrier Certificate

Sample-efficient Safe Learning for Online Nonlinear Control with Control Barrier Functions

Reinforcement Learning-based Receding Horizon Control using Adaptive Control Barrier Functions for Safety-Critical Systems