An Iterative Scheme of Safe Reinforcement Learning for Nonlinear Systems Via Barrier Certificate Generation

Zhengfeng Yang, Yidan Zhang, Wang Lin, Xia Zeng, Xiaochao Tang, Zhenbing Zeng, Zhiming Liu
DOI: https://doi.org/10.1007/978-3-030-81685-8_22
2021-01-01
Abstract: In this paper, we propose a safe reinforcement learning approach to synthesize deep neural network (DNN) controllers for nonlinear systems subject to safety constraints. The proposed approach employs an iterative scheme in which a learner and a verifier interact to synthesize safe DNN controllers. The learner trains a DNN controller via deep reinforcement learning, and the verifier certifies the learned controller by computing a maximal safe initial region and its corresponding barrier certificate, based on polynomial abstraction and bilinear matrix inequality (BMI) solving. Compared with existing verification-in-the-loop synthesis methods, our iterative framework is a sequential synthesis scheme for controllers and barrier certificates, which can learn safe controllers with adaptive barrier certificates rather than user-defined ones. We implement the tool SRLBC and evaluate its performance on a set of benchmark examples. The experimental results demonstrate that our approach efficiently synthesizes safe DNN controllers even for nonlinear systems with dimension up to 12.
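The learner-verifier interaction described in the abstract can be illustrated with a deliberately small sketch. This is not the SRLBC implementation: the one-dimensional system dx/dt = x + u, the linear policy u = -gain * x (standing in for a DNN trained by reinforcement learning), the fixed candidate B(x) = x^2 - r^2, the sampled check (standing in for polynomial abstraction and BMI solving), and the "raise the gain on failure" feedback are all assumptions chosen for illustration, as are the function names.

```python
from typing import Callable, Optional, Tuple


def train_controller(gain: float) -> Callable[[float], float]:
    """Hypothetical learner: returns a linear policy u = -gain * x.

    A real learner would train a DNN with deep reinforcement learning.
    """
    return lambda x: -gain * x


def verify(controller: Callable[[float], float],
           radius: float = 1.0, samples: int = 201) -> bool:
    """Hypothetical verifier: sampled check of a barrier-style condition.

    Candidate B(x) = x^2 - radius^2 for the toy system dx/dt = x + u.
    We check dB/dt = 2 * x * (x + u(x)) <= 0 on a grid over [-2r, 2r].
    Sampling is NOT a proof; a real verifier would certify this formally.
    """
    for i in range(samples):
        x = -2.0 * radius + 4.0 * radius * i / (samples - 1)
        lie_derivative = 2.0 * x * (x + controller(x))
        if lie_derivative > 1e-12:
            return False
    return True


def synthesize(initial_gain: float = 0.25, step: float = 0.5,
               max_rounds: int = 10) -> Optional[Tuple[float, int]]:
    """Alternate learner and verifier until the check passes or rounds run out."""
    gain = initial_gain
    for round_index in range(1, max_rounds + 1):
        controller = train_controller(gain)
        if verify(controller):
            return gain, round_index
        gain += step  # assumed feedback rule: retrain with a stronger policy
    return None


if __name__ == "__main__":
    print(synthesize())
```

In this toy setting the loop stops as soon as the gain exceeds 1, since dB/dt = 2(1 - gain)x^2 is then non-positive everywhere. In the paper's setting the barrier certificate is computed by the verifier rather than fixed in advance.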