Abstract: Bringing dynamic robots into the wild requires a tenuous balance between performance and safety. Yet controllers designed to provide robust safety guarantees often result in conservative behavior, and tuning these controllers to find the ideal trade-off between performance and safety typically requires domain expertise or a carefully constructed reward function. This work presents a design paradigm for systematically achieving behaviors that balance performance and robust safety by integrating safety-aware Preference-Based Learning (PBL) with Control Barrier Functions (CBFs). Fusing these concepts -- safety-aware learning and safety-critical control -- gives a robust means to achieve safe behaviors on complex robotic systems in practice. We demonstrate the capability of this design paradigm to achieve safe and performant perception-based autonomous operation of a quadrupedal robot both in simulation and experimentally on hardware.

What problem does this paper attempt to address?

This paper addresses how to balance performance and safety in dynamic robot control. Specifically, it seeks a method that systematically achieves behaviors that are both high-performance and robustly safe, especially on complex robotic systems. Controllers designed for strong safety guarantees often behave conservatively, and tuning them to find the ideal trade-off between performance and safety usually requires domain expertise or a carefully designed reward function. The paper therefore proposes a design paradigm that combines preference-based learning (PBL) with control barrier functions (CBFs).

### Main Contributions

1. **Proposed Safety-Aware LineCoSpar (SA-LineCoSpar)**: An improved version of the LineCoSpar algorithm that performs preference-based Bayesian optimization in high-dimensional parameter spaces while accounting for safety.
2. **Combined Measured Robust CBFs (MR-CBFs) and Input-State-Safety CBFs (ISSf-CBFs)**: MR-CBFs handle measurement uncertainty and ISSf-CBFs handle perturbations; together they yield provable safety guarantees through reduced-order, multi-layer safety-critical control.
3. **Verified on a quadruped robot**: The method was tested in simulation and in hardware experiments in laboratory and outdoor environments, demonstrating perception-based autonomous operation of a quadruped robot.

### Method Overview

- **Preference-Based Learning (PBL)**: Tunes design parameters from users' subjective feedback (such as pairwise preferences and ordinal labels), which avoids having to define a reward function explicitly.
- **Control Barrier Functions (CBFs)**: Used to ensure the safety of the system, especially in the presence of measurement uncertainty and perturbations.
- **Safety-Aware LineCoSpar (SA-LineCoSpar)**: Combines PBL with CBFs so that unsafe behaviors are avoided during exploration while system performance is maintained.

### Experimental Results

- **Simulation and Hardware Experiments**: Obstacle avoidance tasks were carried out in indoor and outdoor environments on the Unitree A1 quadruped robot. The results show that the TR-OP parameters tuned by SA-LineCoSpar allow the robot to navigate between obstacles effectively while keeping the system safe.

### Formulas

- **Definition of CBFs**:
  \[ \sup_{v \in \mathbb{R}^m} \left( L_f h(x, \rho) + L_g h(x, \rho) v \right) > -\alpha(h(x, \rho)) \]
  where \( L_f h(x, \rho) \) and \( L_g h(x, \rho) \) are the Lie derivatives of \( h \) with respect to \( f \) and \( g \), respectively.
- **Definition of ISSf-CBFs**:
  \[ \sup_{v \in \mathbb{R}^m} \left( L_f h(x, \rho) + L_g h(x, \rho) v - \phi \| L_g h(x, \rho) \|^2 \right) > -\alpha(h(x, \rho)) \]
- **Definition of MR-CBFs**:
  \[ \sup_{v \in \mathbb{R}^m} \left( L_f h(x, \hat{\rho}) + L_g h(x, \hat{\rho}) v - a - b \| v \| \right) > -\alpha(h(x, \hat{\rho})) \]
- **Safety Filter of TR-OP**:
  \[ k(x) = \arg\min_{v \in \mathbb{R}^m} \| v - k_{\text{nom}}(x) \|^2 \]
  Constraints:
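The constraint set of the safety filter is cut off above, so the following is only a hedged illustration of how a filter of this general form behaves, not the paper's TR-OP. It is a minimal Python sketch that solves \( \min_v \| v - k_{\text{nom}} \|^2 \) in closed form under a single ISSf-style constraint, \( L_f h + L_g h\, v - \phi \| L_g h \|^2 \ge -\alpha h \). The following are assumptions made for the example and do not come from the source: the linear class-K function \( \alpha(h) = \alpha h \), the non-strict inequality, the omission of the MR-CBF terms \( a + b\|v\| \) (which would turn the problem into a second-order cone program), the function name `safety_filter`, and the single-integrator obstacle scenario.

```python
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean inner product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


def safety_filter(
    k_nom: Sequence[float],
    Lf_h: float,
    Lg_h: Sequence[float],
    h: float,
    alpha: float = 1.0,  # assumed linear class-K gain: alpha(h) = alpha * h
    phi: float = 0.0,    # ISSf robustness weight; phi = 0 gives a plain CBF filter
) -> List[float]:
    """Minimally modify k_nom so that
    Lf_h + Lg_h . v - phi * ||Lg_h||^2 >= -alpha * h.

    With one affine constraint, the QP solution is the projection of
    k_nom onto the constraint half-space.
    """
    Lg_sq = dot(Lg_h, Lg_h)
    # Constraint value at the nominal input; non-negative means already safe.
    residual = Lf_h + dot(Lg_h, k_nom) - phi * Lg_sq + alpha * h
    if residual >= 0.0:
        return list(k_nom)
    if Lg_sq == 0.0:
        raise ValueError("constraint cannot be satisfied: L_g h is zero")
    # Shift k_nom along Lg_h just far enough to make the constraint active.
    scale = -residual / Lg_sq
    return [k + scale * g for k, g in zip(k_nom, Lg_h)]


# Hypothetical scenario: single integrator x_dot = v (so f = 0, g = I),
# unit-radius obstacle at the origin, h(x) = ||x||^2 - 1.
x = [1.5, 0.0]
h_val = dot(x, x) - 1.0
Lf_h_val = 0.0
Lg_h_val = [2.0 * xi for xi in x]
k_nominal = [-2.0, 0.0]  # nominal command drives straight at the obstacle

v_cbf = safety_filter(k_nominal, Lf_h_val, Lg_h_val, h_val)
v_issf = safety_filter(k_nominal, Lf_h_val, Lg_h_val, h_val, phi=0.1)
print("CBF-filtered input: ", v_cbf)
print("ISSf-filtered input:", v_issf)
```

In this sketch, increasing `phi` tightens the constraint and slows the approach toward the obstacle further, which illustrates the kind of performance-versus-safety trade-off that a tuning procedure would have to balance.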

Safety-Aware Preference-Based Learning for Safety-Critical Control

Whole-body Dynamic Collision Avoidance with Time-varying Control Barrier Functions

Learning Observation-Based Certifiable Safe Policy for Decentralized Multi-Robot Navigation

A Learning-Based Framework for Safe Human-Robot Collaboration with Multiple Backup Control Barrier Functions

End-to-End Safe Reinforcement Learning through Barrier Functions for Safety-Critical Continuous Control Tasks

Learning Safe, Generalizable Perception-Based Hybrid Control With Certificates

Learning for Safety-Critical Control with Control Barrier Functions

Learning Adaptive Safety for Multi-Agent Systems

Reinforcement Learning for Safety-Critical Control under Model Uncertainty, using Control Lyapunov Functions and Control Barrier Functions

Learning for Layered Safety-Critical Control with Predictive Control Barrier Functions

Bayesian Learning-Based Adaptive Control for Safety Critical Systems

Robust Safe Learning and Control in An Unknown Environment: An Uncertainty-Separated Control Barrier Function Approach

Sablas: Learning Safe Control for Black-Box Dynamical Systems

Model-Assisted Probabilistic Safe Adaptive Control With Meta-Bayesian Learning

Learning Local Control Barrier Functions for Hybrid Systems

Safety-Critical Model Predictive Control with Discrete-Time Control Barrier Function

Reinforcement Learning for Safe Robot Control using Control Lyapunov Barrier Functions

A General Safety Framework for Learning-Based Control in Uncertain Robotic Systems

Safe Controller for Output Feedback Linear Systems using Model-Based Reinforcement Learning

Probabilistically safe controllers based on control barrier functions and scenario model predictive control

Recursively Feasible Probabilistic Safe Online Learning with Control Barrier Functions