Value iteration for LQR control of unknown stochastic-parameter linear systems

Wenwu Fan, Junlin Xiong
DOI: https://doi.org/10.1016/j.sysconle.2024.105731
IF: 2.742
2024-01-27
Systems & Control Letters
Abstract: This paper addresses the linear-quadratic optimal control problem for unknown stochastic-parameter linear systems using reinforcement learning methods. Based on the second moments of the random system matrices, a model-based value iteration algorithm is proposed to solve the problem and is proved to converge by means of the contraction mapping theorem. For the case where no information about the random system matrices is available, a normalized model-free value iteration algorithm is presented that learns the optimal control law by estimating the data at the next time step. In the model-free algorithm, the collected data are first normalized to reduce the error of the least squares method. It is proved that the algorithm obtains an approximate optimal solution. The algorithm is applicable to any distribution of the random system matrices and does not require an initial mean-square stabilizing control policy. Finally, an example illustrates that the algorithm converges to an approximate optimal control policy and that the normalization step can significantly reduce convergence errors.
automation & control systems, operations research & management science
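
The model-based idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm as published; it is the standard generalized Riccati recursion for a system x_{k+1} = A_k x_k + B_k u_k, under the assumptions that the pairs (A_k, B_k) are i.i.d., that their distribution is represented by a finite, equally weighted set of samples, and that the iteration starts from P = 0. The function name `value_iteration` and its parameters are hypothetical.

```python
import numpy as np

def value_iteration(A_samples, B_samples, Q, R, iters=1000, tol=1e-10):
    """Sketch of value iteration for LQR with random system matrices.

    Assumption: (A_k, B_k) are i.i.d. and take each listed sample pair
    with equal probability, so expectations are plain averages.
    """
    A = np.asarray(A_samples, dtype=float)   # shape (N, n, n)
    B = np.asarray(B_samples, dtype=float)   # shape (N, n, m)
    At = A.transpose(0, 2, 1)
    Bt = B.transpose(0, 2, 1)
    P = np.zeros((A.shape[1], A.shape[1]))   # P = 0: no stabilizing policy needed to start

    def moments(P):
        # Second-moment terms E[A'PA], E[A'PB], E[B'PB].
        return (np.mean(At @ P @ A, axis=0),
                np.mean(At @ P @ B, axis=0),
                np.mean(Bt @ P @ B, axis=0))

    for _ in range(iters):
        EAPA, EAPB, EBPB = moments(P)
        # P <- Q + E[A'PA] - E[A'PB] (R + E[B'PB])^{-1} E[B'PA]
        P_next = Q + EAPA - EAPB @ np.linalg.solve(R + EBPB, EAPB.T)
        converged = np.max(np.abs(P_next - P)) < tol
        P = P_next
        if converged:
            break

    # Feedback gain for u = -K x, computed from the final P.
    _, EAPB, EBPB = moments(P)
    K = np.linalg.solve(R + EBPB, EAPB.T)
    return P, K
```

For a scalar example with A equal to 0.5 or 1.5 with equal probability and B = 1, calling `value_iteration([[[0.5]], [[1.5]]], [[[1.0]], [[1.0]]], np.eye(1), np.eye(1))` returns a P that satisfies the fixed-point equation of the recursion. Only averages of products such as A'PA enter the update, which is why second moments of the random matrices are sufficient in this sketch.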