Abstract: This paper presents a one-shot learning approach with performance and robustness guarantees for the linear quadratic regulator (LQR) control of stochastic linear systems. Even though data-based LQR control has been widely considered, existing results suffer either from data hungriness due to the inherently iterative nature of the optimization formulation (e.g., value learning or policy gradient reinforcement learning algorithms) or from a lack of robustness guarantees in one-shot non-iterative algorithms. To avoid data hungriness while ensuring robustness guarantees, an adaptive dynamic programming formalization of the LQR is presented that relies on solving a Bellman inequality. The control gain and the value function are directly learned by using a control-oriented approach that characterizes the closed-loop system using data and a decision variable from which the control is obtained. This closed-loop characterization is noise-dependent. The effect of the closed-loop system noise on the Bellman inequality is considered to ensure both robust stability and suboptimal performance despite ignoring the measurement noise. To ensure robust stability, it is shown that this system characterization leads to a closed-loop system with multiplicative and additive noise, enabling the application of distributionally robust control techniques. The analysis of the suboptimality gap reveals that robustness can be achieved without the need for regularization or parameter tuning. The simulation results on the active car suspension problem demonstrate the superiority of the proposed method in terms of robustness and performance gap compared to existing methods.
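To make the idea of a data-based closed-loop characterization concrete, the following is a minimal sketch of one common construction from the direct data-driven control literature, and it is not claimed to be the paper's algorithm. It assumes a noise-free trajectory; the system matrices `A` and `B`, the gain `K`, the horizon `T`, and the names `X0`, `X1`, `U0`, `G` are all hypothetical choices made for illustration. It checks that a decision variable `G` satisfying `X0 @ G = I` and `U0 @ G = K` gives `X1 @ G = A + B @ K`, so the closed loop is expressed through data alone.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2-state, 1-input system, used only to generate data.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
n, m, T = 2, 1, 10

# Collect one noise-free trajectory under random inputs.
U0 = rng.standard_normal((m, T))
X = np.zeros((n, T + 1))
X[:, 0] = rng.standard_normal(n)
for t in range(T):
    X[:, t + 1] = A @ X[:, t] + B @ U0[:, t]
X0, X1 = X[:, :T], X[:, 1:]

# Pick an arbitrary gain K (assumption), then solve [X0; U0] G = [I; K].
# This requires the stacked data matrix to have full row rank n + m.
K = np.array([[-1.0, -2.0]])
W = np.vstack([X0, U0])
G = np.linalg.pinv(W) @ np.vstack([np.eye(n), K])

# Closed-loop matrix from data alone versus from the (normally unknown) model.
closed_loop_data = X1 @ G
closed_loop_model = A + B @ K
print(np.allclose(closed_loop_data, closed_loop_model))
```

If additive process noise were present in the data, `X1 @ G` would pick up an extra term equal to the stacked noise samples multiplied by `G`, which is one way to see why a characterization of this kind is noise-dependent.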

Non-Episodic Learning for Online LQR of Unknown Linear Gaussian System

A Learning-Based Optimal Tracking Controller for Continuous Linear Systems with Unknown Dynamics: Theory and Case Study

Online reinforcement learning control of unknown nonaffine nonlinear discrete time systems

Observation-based Optimal Control Law Learning with LQR Reconstruction

Learning to Control under Uncertainty with Data-Based Iterative Linear Quadratic Regulator

Learning Robust Data-based LQG Controllers from Noisy Data

Learning from similar systems and online data-driven LQR using iterative randomised data compression

Online Actuator Selection and Controller Design for Linear Quadratic Regulation with Unknown System Model

Direct Data-Driven Discounted Infinite Horizon Linear Quadratic Regulator with Robustness Guarantees

Stability-Certified On-Policy Data-Driven LQR via Recursive Learning and Policy Gradient

Value iteration for LQR control of unknown stochastic-parameter linear systems

Episodic Linear Quadratic Regulators with Low-rank Transitions

Learning Algorithm for LQG Model with Constrained Control

Data-Driven Adversarial Online Control for Unknown Linear Systems

Learning the Linear Quadratic Regulator from Nonlinear Observations

Discrete-Time Adaptive Iterative Learning Control for High-Order Nonlinear Systems with Unknown Control Directions

A Q-Learning Algorithm for Discrete-Time Linear-Quadratic Control with Random Parameters of Unknown Distribution: Convergence and Stabilization

Online Off-Policy Reinforcement Learning for Optimal Control of Unknown Nonlinear Systems Using Neural Networks

Imitation and Transfer Learning for LQG Control

Adaptive Optimal Control with Guaranteed Convergence Rate for Continuous-Time Linear Systems with Completely Unknown Dynamics