Safe Inverse Reinforcement Learning via Control Barrier Function

Yue Yang, Letian Chen, Matthew Gombolay

DOI: https://doi.org/10.48550/arXiv.2212.02753

2023-03-07

Abstract: Learning from Demonstration (LfD) is a powerful method for enabling robots to perform novel tasks, as it is often more tractable for a non-roboticist end-user to demonstrate the desired skill, and for the robot to learn efficiently from the associated data, than for a human to engineer a reward function for the robot to learn the skill via reinforcement learning (RL). Safety issues arise in modern LfD techniques, e.g., Inverse Reinforcement Learning (IRL), just as they do for RL; yet safe learning in LfD has received little attention. In the context of agile robots, safety is especially vital due to the possibility of robot-environment collision, robot-human collision, and damage to the robot. In this paper, we propose a safe IRL framework, CBFIRL, that leverages the Control Barrier Function (CBF) to enhance the safety of the IRL policy. The core idea of CBFIRL is to combine a loss function inspired by CBF requirements with the objective in an IRL method, both of which are jointly optimized via gradient descent. In the experiments, we show that our framework is safer than IRL methods without CBF, with $\sim15\%$ and $\sim20\%$ improvement for two levels of difficulty of a 2D racecar domain and $\sim50\%$ improvement for a 3D drone domain.

Robotics, Artificial Intelligence, Machine Learning

What problem does this paper attempt to address?

This paper addresses safety in inverse reinforcement learning (IRL), a technique for learning from demonstration (LfD). Specifically, it asks how control barrier functions (CBFs) can make IRL policies safer when agile robots, such as racecars and drones, execute tasks, so that the robots avoid dangerous states and collisions. Although IRL methods can learn policies from human expert demonstrations, they do not by themselves address safety: a robot may take unsafe actions that damage it or cause accidents, especially in complex, dynamic environments. The paper therefore proposes a framework, CBFIRL, that jointly optimizes the IRL objective with a loss function inspired by CBF requirements via gradient descent. In the reported experiments, this improves safety over IRL methods without CBF by roughly 15% and 20% on two difficulty levels of a 2D racecar domain and by roughly 50% on a 3D drone domain.
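The idea of adding a CBF-inspired loss to an IRL objective can be illustrated with a minimal sketch. It is not the paper's implementation: the three hinge penalties, the discrete-time form of the barrier condition, and the names `cbf_hinge_loss`, `joint_loss`, `alpha`, and `weight` are all assumptions made for illustration, and the IRL loss is treated as an opaque scalar.

```python
import numpy as np


def cbf_hinge_loss(h_safe, h_unsafe, h_now, h_next, alpha=0.1):
    """Hinge penalties for three commonly used CBF-style requirements.

    Assumed (hypothetical) inputs, all arrays of barrier values h(x):
      h_safe   - values at states labeled safe
      h_unsafe - values at states labeled unsafe
      h_now    - values at states x_k along a trajectory
      h_next   - values at the successor states x_{k+1}
    """
    h_safe = np.asarray(h_safe, dtype=float)
    h_unsafe = np.asarray(h_unsafe, dtype=float)
    h_now = np.asarray(h_now, dtype=float)
    h_next = np.asarray(h_next, dtype=float)

    # (1) h(x) >= 0 on safe states: penalize negative values.
    loss_safe = np.maximum(-h_safe, 0.0).mean()
    # (2) h(x) < 0 on unsafe states: penalize non-negative values.
    loss_unsafe = np.maximum(h_unsafe, 0.0).mean()
    # (3) Discrete-time barrier condition:
    #     h(x_{k+1}) - h(x_k) + alpha * h(x_k) >= 0, with 0 < alpha <= 1.
    residual = h_next - h_now + alpha * h_now
    loss_step = np.maximum(-residual, 0.0).mean()

    return loss_safe + loss_unsafe + loss_step


def joint_loss(irl_loss, cbf_loss, weight=1.0):
    """Weighted sum of an IRL objective and the CBF penalty (assumed form)."""
    return irl_loss + weight * cbf_loss
```

In a real training loop the barrier values would come from a differentiable model and the sum would be minimized with an automatic-differentiation framework; NumPy is used here only to make the arithmetic of the penalties explicit.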

Learning Observation-Based Certifiable Safe Policy for Decentralized Multi-Robot Navigation

End-to-End Safe Reinforcement Learning through Barrier Functions for Safety-Critical Continuous Control Tasks

Safe and Efficient Reinforcement Learning Using Disturbance-Observer-Based Control Barrier Functions

Reinforcement Learning for Safe Robot Control using Control Lyapunov Barrier Functions

Disturbance Observer-based Control Barrier Functions with Residual Model Learning for Safe Reinforcement Learning

Safe Reinforcement Learning Using Robust Control Barrier Functions

Model-Free Safe Reinforcement Learning Through Neural Barrier Certificate

Optimal control barrier functions for RL based safe powertrain control

Safe Exploration in Reinforcement Learning: Training Backup Control Barrier Functions with Zero Training Time Safety Violations

Synthesizing Control Barrier Functions with Feasible Region Iteration for Safe Reinforcement Learning

Learning Control Barrier Functions and their application in Reinforcement Learning: A Survey

Cautious Adaptation For Reinforcement Learning in Safety-Critical Settings

Learning Adaptive Safety for Multi-Agent Systems

Learning-Based Control Barrier Function with Provably Safe Guarantees: Reducing Conservatism with Heading-Aware Safety Margin

Implicit Safe Set Algorithm for Provably Safe Reinforcement Learning

Stable and Safe Reinforcement Learning via a Barrier-Lyapunov Actor-Critic Approach

Safe Spacecraft Inspection via Deep Reinforcement Learning and Discrete Control Barrier Functions

Learning Robust Output Control Barrier Functions from Safe Expert Demonstrations

Stable Inverse Reinforcement Learning: Policies from Control Lyapunov Landscapes

Safe Reinforcement Learning for Dynamical Systems Using Barrier Certificates