FRIDAY: Real-time Learning DNN-based Stable LQR controller for Nonlinear Systems under Uncertain Disturbances

Takahito Fujimori
2024-12-02
Abstract: The Linear Quadratic Regulator (LQR) is often combined with feedback linearization (FBL) for nonlinear systems whose nonlinearity is additive to the input. Conventional approaches estimate and cancel the nonlinearity using first-principles models or data-driven methods such as Gaussian Processes (GPs). However, the former requires an elaborate modeling process, and the latter yields a fixed learned model, which may perform poorly when the system dynamics change. In this letter, we use a Deep Neural Network (DNN) trained on a real-time-updated dataset to approximate the unknown nonlinearity while the controller is running. By spectrally normalizing the weights at each time step, we stably incorporate the DNN prediction into an LQR controller and compensate for the nonlinear term. Leveraging the bounded Lipschitz constant of the DNN, we provide a theoretical analysis and establish local exponential stability of the proposed controller. Simulation results show that our controller significantly outperforms baseline controllers in trajectory tracking cases.
Systems and Control
What problem does this paper attempt to address?
This paper addresses how to achieve real-time learning and stable control of nonlinear systems under uncertain disturbances. Specifically, the author proposes a real-time learning method based on a deep neural network (DNN) that compensates for unknown nonlinear dynamics, and combines it with a linear quadratic regulator (LQR) so that the controller remains stable in the presence of uncertainties.

### Problem Description

The paper considers a control-affine nonlinear dynamic system with the state equation

\[ \dot{x} = f(x) + g(x)u \]

where \(x\in\mathbb{R}^n\) is the state vector and \(u\in\mathbb{R}^m\) is the input vector. To handle this system, the author decomposes it into a linear time-invariant (LTI) part and a nonlinear term:

\[ \dot{x} = Ax + B(u + R(x, u)) \]

Here \(A\) and \(B\) are time-invariant matrices, and \(R(x, u)\) represents the unknown nonlinear dynamics, including model uncertainties. It is called the residual dynamics.

### Objectives

The paper aims to estimate and compensate for the residual dynamics through real-time learning, so that the LQR controller effectively acts on the remaining LTI system. Specifically, the author pursues the following objectives:

1. **Real-time learning**: Update the dataset at each time step and optimize the DNN with stochastic gradient descent (SGD).
2. **Stability guarantee**: Constrain the Lipschitz constant of the DNN through spectral normalization (SN) to ensure local exponential stability of the closed-loop system.
3. **Performance improvement**: Improve the accuracy and response speed of trajectory tracking compared with traditional adaptive controllers and plain LQR controllers.

### Main Contributions

1. **Real-time update of DNN weights**: Continuously update the weights of all layers through simple SGD instead of complex adaptive laws.
2. **Stability guarantee**: Use spectral normalization to keep the Lipschitz constant of the DNN bounded, so that its predictions can be incorporated stably into the LQR controller.
3. **Theoretical analysis**: Prove local exponential stability of the proposed controller under bounded learning errors.
4. **Experimental verification**: Demonstrate in simulation that FRIDAY outperforms the baseline controllers in trajectory-tracking tasks, especially when dealing with unknown dynamics.

### Experimental Results

Simulation results show that FRIDAY tracks trajectories significantly better than traditional adaptive controllers and LQR controllers across different types of nonlinear models (parameter variation, multiple multiplicative nonlinearities, and environmental force impacts). In particular, FRIDAY converges quickly and accurately to the target set point and is more robust in complex environments.

In conclusion, the paper proposes a framework that achieves real-time learning and stable control under uncertain disturbances, supported by a stability analysis and simulation results.
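
To make the decomposition and objectives above concrete, the following is a minimal Python sketch of the idea on a scalar toy problem. It is not the author's implementation. The plant parameters, the residual `0.3 * sin(x)`, the network shape, the learning rate, and all function and class names are assumptions made for illustration. The sketch also simplifies the method in three ways: the residual depends on \(x\) only rather than on \((x, u)\), each SGD step uses only the latest sample rather than a maintained dataset, and weights are rescaled only when their norm exceeds the bound, which is one common variant of spectral normalization.

```python
import math
import random


def scalar_lqr_gain(a, b, q, r):
    """LQR gain for x_dot = a*x + b*u with cost integral of q*x^2 + r*u^2."""
    # Positive root of the scalar Riccati equation 2*a*p - (b*p)^2 / r + q = 0.
    p = r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)
    return b * p / r


class ResidualNet:
    """One-hidden-layer ReLU network R_hat(x) for a scalar state (assumed shape)."""

    def __init__(self, hidden, gamma, seed=0):
        rng = random.Random(seed)
        self.w1 = [rng.uniform(-1.0, 1.0) for _ in range(hidden)]
        self.b1 = [rng.uniform(-1.0, 1.0) for _ in range(hidden)]
        self.w2 = [rng.uniform(-1.0, 1.0) for _ in range(hidden)]
        # Splitting gamma evenly over the two weight layers bounds the
        # network's Lipschitz constant by gamma, since ReLU is 1-Lipschitz.
        self.layer_bound = math.sqrt(gamma)
        self.normalize()

    def normalize(self):
        # For a single row or column the spectral norm is the Euclidean norm.
        for w in (self.w1, self.w2):
            norm = math.sqrt(sum(v * v for v in w))
            if norm > self.layer_bound:
                scale = self.layer_bound / norm
                w[:] = [v * scale for v in w]

    def hidden(self, x):
        return [max(0.0, w * x + b) for w, b in zip(self.w1, self.b1)]

    def predict(self, x):
        return sum(w * h for w, h in zip(self.w2, self.hidden(x)))

    def sgd_step(self, x, target, lr):
        """One SGD step on 0.5 * (R_hat(x) - target)^2, then re-normalize."""
        h = self.hidden(x)
        err = sum(w * hj for w, hj in zip(self.w2, h)) - target
        for j, hj in enumerate(h):
            grad_z = err * self.w2[j] if hj > 0.0 else 0.0  # uses pre-update w2
            self.w2[j] -= lr * err * hj
            self.w1[j] -= lr * grad_z * x
            self.b1[j] -= lr * grad_z
        self.normalize()


def true_residual(x):
    # Hypothetical unknown nonlinearity, chosen only for this demo.
    return 0.3 * math.sin(x)


def simulate(steps=2000, dt=0.01, lr=0.05, gamma=0.5):
    a, b = 0.0, 1.0  # hypothetical nominal LTI model
    k = scalar_lqr_gain(a, b, q=1.0, r=1.0)
    net = ResidualNet(hidden=8, gamma=gamma)
    x = 1.0
    for _ in range(steps):
        u = -k * x - net.predict(x)        # LQR feedback minus learned residual
        x_dot = a * x + b * (u + true_residual(x))
        target = (x_dot - a * x) / b - u   # residual implied by the measurement
        net.sgd_step(x, target, lr)        # learn while the controller runs
        x += dt * x_dot                    # forward-Euler integration
    return x, net
```

Calling `simulate()` returns the final state and the trained network. Because `normalize()` runs after every SGD step, the product of the two layer norms, and hence the Lipschitz constant of `predict`, stays at or below `gamma` throughout the run. This bounded-Lipschitz property is what the stability analysis described above relies on.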