Safety-Aware Optimal Control of Nonlinear Systems Using Off-Policy Reinforcement Learning

Mingduo Lin, Bo Zhao, Hongbing Xia, Derong Liu
DOI: https://doi.org/10.1109/csis-iac60628.2023.10363836
2023-01-01
Abstract: In this paper, we investigate the safety-aware optimal control (SAOC) problem, which aims to minimize a predefined performance index function while ensuring the safety of nonlinear systems. First, a barrier function-based system transformation is used to design an optimal control policy that keeps the system states within the safety region. To deal with the input constraints, a non-quadratic cost function is imposed on the control input. Then, the Hamilton-Jacobi-Bellman equation is established to provide the solution of the SAOC problem. Moreover, by utilizing the off-policy Bellman equation, a data-based off-policy reinforcement learning (OPRL) algorithm is developed to obtain the safety-aware optimal controller in a model-free manner. To implement this algorithm, a data collection process with the barrier transform is executed to generate the off-policy trajectory data, and an actor-critic neural network structure with a least-squares updating law is employed in the off-policy learning phase. Finally, a simulation example is provided to verify the effectiveness of the developed control method.
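The abstract names a barrier function-based transformation and a non-quadratic input cost but does not give their form. The following is a minimal sketch under assumptions: it uses a logarithmic barrier function and an inverse-hyperbolic-tangent input penalty, which are common choices in related safe-control work and may differ from what the paper actually uses. The function names (`barrier`, `barrier_inverse`, `input_penalty`) and the parameters `a`, `A`, `lam`, and `r` are hypothetical and chosen only for illustration.

```python
import math


def barrier(z, a, A):
    """Map a constrained scalar state z in (a, A), with a < 0 < A, to the real line.

    Assumed form: b(z) = log((A / a) * (a - z) / (A - z)).
    The value is 0 at z = 0 and diverges as z approaches either bound.
    """
    if not (a < 0 < A):
        raise ValueError("bounds must satisfy a < 0 < A")
    if not (a < z < A):
        raise ValueError("z must lie strictly inside (a, A)")
    return math.log((A / a) * (a - z) / (A - z))


def barrier_inverse(s, a, A):
    """Map an unconstrained value s back into the interval (a, A)."""
    e_pos = math.exp(s / 2.0)
    e_neg = math.exp(-s / 2.0)
    # The denominator is always negative because a < 0 < A, so it never vanishes.
    return a * A * (e_pos - e_neg) / (a * e_pos - A * e_neg)


def input_penalty(u, lam, r=1.0):
    """Non-quadratic penalty for a scalar input bounded by |u| < lam.

    Assumed form: the closed-form value of the integral of
    2 * lam * r * atanh(v / lam) with respect to v from 0 to u.
    """
    if abs(u) >= lam:
        raise ValueError("input must satisfy |u| < lam")
    return (2.0 * lam * r * u * math.atanh(u / lam)
            + lam ** 2 * r * math.log(1.0 - (u / lam) ** 2))


if __name__ == "__main__":
    a, A = -1.0, 2.0                  # hypothetical safety bounds
    s = barrier(1.5, a, A)            # transformed (unconstrained) state
    print(s, barrier_inverse(s, a, A))
    print(input_penalty(0.5, lam=1.0))
```

Under these assumptions, a controller designed in the transformed coordinate keeps the original state inside `(a, A)` as long as the transformed state stays finite, and the penalty grows without bound in slope as `|u|` approaches `lam`, which discourages inputs near the saturation limit.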