Neural Lyapunov Model Predictive Control: Learning Safe Global Controllers from Sub-optimal Examples

Mayank Mittal, Marco Gallieri, Alessio Quaglino, Seyed Sina Mirrazavi Salehian, Jan Koutník
DOI: https://doi.org/10.48550/arXiv.2002.10451
IF: 14.4
2020-02-21
Artificial Intelligence
Abstract: With a growing interest in data-driven control techniques, Model Predictive Control (MPC) provides an opportunity to exploit the surplus of data reliably, particularly while taking safety and stability into account. In many real-world and industrial applications, an existing control strategy is typically available, for instance, demonstrations from a human operator. The objective of this work is to improve upon this unknown, safe but suboptimal policy by learning a new controller that retains safety and stability. Learning how to be safe is achieved directly from data and from knowledge of the system constraints. The proposed algorithm alternately learns the terminal cost and updates the MPC parameters according to a stability metric. The terminal cost is constructed as a Lyapunov function neural network with the aim of recovering or extending the stable region of the initial demonstrator using a short prediction horizon. Theorems are presented that characterize the stability and performance of the learned MPC in the presence of model uncertainties and of sub-optimality due to function approximation. The efficacy of the proposed algorithm is demonstrated on non-linear continuous control tasks with soft constraints. The proposed approach can also improve upon the initial demonstrator in practice and achieve better stability than popular reinforcement learning baselines.
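To make the idea of a Lyapunov function neural network more concrete, the following is a minimal sketch, not the paper's architecture or training procedure. It assumes a hypothetical candidate V(x) = eps*||x||^2 + ||tanh(W x)||^2, which is positive definite by construction, and empirically checks a decrease condition on sampled states of a toy stable system. The names make_lyapunov_candidate and decrease_fraction, the weights W, and the dynamics A are all illustrative assumptions.

```python
import numpy as np

def make_lyapunov_candidate(W, eps=1e-3):
    """Return V(x) = eps*||x||^2 + ||tanh(W x)||^2.

    V(0) = 0 and V(x) > 0 for x != 0, so V is positive definite
    regardless of the (hypothetical) weights W.
    """
    def V(x):
        x = np.asarray(x, dtype=float)
        return eps * float(x @ x) + float(np.sum(np.tanh(W @ x) ** 2))
    return V

def decrease_fraction(V, step, states, lam=1.0):
    """Fraction of sampled states satisfying V(step(x)) <= lam * V(x)."""
    ok = [V(step(x)) <= lam * V(x) for x in states]
    return float(np.mean(ok))

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 2))      # untrained, illustrative weights (assumption)
V = make_lyapunov_candidate(W)

A = 0.5 * np.eye(2)              # toy stable closed loop x+ = A x (assumption)

def step(x):
    return A @ x

# Sample states in a box and measure how often the candidate decreases.
states = rng.uniform(-2.0, 2.0, size=(200, 2))
frac = decrease_fraction(V, step, states)
print(f"decrease condition holds on {frac:.0%} of sampled states")
```

In a learning setting, a fraction of this kind could serve as a rough empirical stability metric over a region of interest. The paper's own metric and loss may differ from this sketch.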