Safe Inverse Reinforcement Learning via Control Barrier Function

Yue Yang, Letian Chen, Matthew Gombolay
DOI: https://doi.org/10.48550/arXiv.2212.02753
2023-03-07
Abstract: Learning from Demonstration (LfD) is a powerful method for enabling robots to perform novel tasks, as it is often more tractable for a non-roboticist end-user to demonstrate the desired skill and for the robot to efficiently learn from the associated data than for a human to engineer a reward function for the robot to learn the skill via reinforcement learning (RL). Safety issues arise in modern LfD techniques, e.g., Inverse Reinforcement Learning (IRL), just as they do for RL; yet, safe learning in LfD has received little attention. In the context of agile robots, safety is especially vital due to the possibility of robot-environment collision, robot-human collision, and damage to the robot. In this paper, we propose a safe IRL framework, CBFIRL, that leverages the Control Barrier Function (CBF) to enhance the safety of the IRL policy. The core idea of CBFIRL is to combine a loss function inspired by CBF requirements with the objective in an IRL method, both of which are jointly optimized via gradient descent. In the experiments, we show that our framework behaves more safely than IRL methods without CBF, with $\sim15\%$ and $\sim20\%$ improvement for two levels of difficulty of a 2D racecar domain and $\sim 50\%$ improvement for a 3D drone domain.
Robotics, Artificial Intelligence, Machine Learning
What problem does this paper attempt to address?
This paper addresses the lack of safety guarantees in Inverse Reinforcement Learning (IRL), a common approach to Learning from Demonstration (LfD). Standard IRL methods can recover a policy from expert demonstrations, but nothing in their objective discourages the learned policy from entering unsafe states. For agile robots such as racecars and drones, this can lead to collisions with the environment or with humans, or to damage to the robot itself. The paper proposes a framework, CBFIRL, that adds a loss term derived from Control Barrier Function (CBF) requirements to the IRL objective and optimizes both jointly via gradient descent. The stated goal is a policy that is safer than one trained by IRL alone: the authors report safety improvements of roughly 15% and 20% on two difficulty levels of a 2D racecar domain and roughly 50% on a 3D drone domain.
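The idea of combining a CBF-inspired loss with an IRL objective can be illustrated with a minimal sketch. The text above does not give the exact loss used in CBFIRL, so everything below is an assumption: the hinge-style penalties follow a common pattern for barrier-function losses, the barrier `barrier` is a hand-written distance function rather than a learned network, and the names `cbf_loss`, `total_loss`, `alpha`, and `weight` are hypothetical.

```python
import math


def relu(x):
    """Hinge helper: zero when the constraint holds, positive otherwise."""
    return max(0.0, x)


def barrier(state, obstacle=(0.0, 0.0), radius=1.0):
    """Hypothetical barrier h(s): positive outside a circular obstacle, negative inside.

    In a learned setting h would typically be a neural network; a fixed
    distance function is used here only to keep the example self-contained.
    """
    return math.dist(state, obstacle) - radius


def cbf_loss(safe_states, unsafe_states, transitions, alpha=0.5):
    """Assumed CBF-style loss with three hinge penalties.

    (1) h(s) >= 0 on states labeled safe,
    (2) h(s) <= 0 on states labeled unsafe,
    (3) a discrete-time barrier condition h(s') - h(s) + alpha * h(s) >= 0
        on observed transitions (s, s').
    """
    def mean(values):
        values = list(values)
        return sum(values) / len(values) if values else 0.0

    safe_term = mean(relu(-barrier(s)) for s in safe_states)
    unsafe_term = mean(relu(barrier(s)) for s in unsafe_states)
    transition_term = mean(
        relu(-(barrier(s_next) - barrier(s) + alpha * barrier(s)))
        for s, s_next in transitions
    )
    return safe_term + unsafe_term + transition_term


def total_loss(irl_loss, cbf_penalty, weight=1.0):
    """Joint objective: IRL loss plus a weighted CBF penalty (weight is an assumption)."""
    return irl_loss + weight * cbf_penalty


# Example: one safe state, one unsafe state, and one transition that
# approaches the obstacle faster than the barrier condition allows.
penalty = cbf_loss(
    safe_states=[(2.0, 0.0)],
    unsafe_states=[(0.5, 0.0)],
    transitions=[((2.0, 0.0), (1.2, 0.0))],
)
objective = total_loss(irl_loss=1.0, cbf_penalty=penalty, weight=2.0)
```

In this example only the transition term is nonzero, because the step from `(2.0, 0.0)` to `(1.2, 0.0)` reduces the barrier value faster than the assumed rate `alpha` permits. In a gradient-based setup, the combined objective would be differentiated with respect to the policy (and, if learned, the barrier) parameters, so the policy is pushed toward both matching the demonstrations and satisfying the barrier conditions.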