A Reinforcement Learning Method for LQR Control Problem

文锋,陈宗海,周光明,陈春林
DOI: https://doi.org/10.3969/j.issn.1003-6059.2006.03.021
2006-01-01
Pattern Recognition and Artificial Intelligence
Abstract: Current convergence analyses of reinforcement learning methods mainly apply to discrete-state problems. Analyses of continuous-state reinforcement learning methods are limited to simple LQR control problems. After analyzing two convergent reinforcement learning methods for the LQR control problem, a new method that requires only partial model information is proposed to overcome the shortcomings of these two methods. In this method, a recursive least-squares TD method is used to estimate the parameters of the value function, and a recursive least-squares method is used to estimate the greedily improved policy. In the theoretical analysis, a convergence proof is presented for the proposed policy iteration method in the ideal case. Simulation results show that the method converges to an optimal control policy.
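To make the policy iteration idea in the abstract concrete, the following is a minimal sketch on a scalar LQR problem. It is not the paper's algorithm, and it rests on several assumptions that are not in the source: the system values (a = 1.2, b = 1, q = r = 1), the initial stabilizing gain, and every function name are hypothetical. The sketch estimates a full quadratic Q-function instead of using partial model information. It applies plain recursive least squares to the Bellman equation, which matches a TD-style solution here only because the dynamics are noise-free. It also reads the greedy gain directly from the estimated parameters instead of estimating it with a second recursive least-squares pass.

```python
import random


def rls_update(theta, P, d, y):
    """One recursive least-squares step for the linear model y ~ d . theta."""
    Pd = [sum(P[i][j] * d[j] for j in range(3)) for i in range(3)]
    denom = 1.0 + sum(d[i] * Pd[i] for i in range(3))
    gain = [v / denom for v in Pd]
    err = y - sum(d[i] * theta[i] for i in range(3))
    theta = [theta[i] + gain[i] * err for i in range(3)]
    # P is symmetric, so P - gain * (d^T P) equals P - gain * Pd^T.
    P = [[P[i][j] - gain[i] * Pd[j] for j in range(3)] for i in range(3)]
    return theta, P


def features(x, u):
    """Quadratic features so that Q(x, u) = h11*x^2 + 2*h12*x*u + h22*u^2."""
    return [x * x, 2.0 * x * u, u * u]


def evaluate_policy(k, a, b, q, r, steps, rng):
    """Estimate the Q-function parameters of the policy u = -k * x."""
    theta = [0.0, 0.0, 0.0]
    P = [[1e6 if i == j else 0.0 for j in range(3)] for i in range(3)]
    x = 1.0
    for _ in range(steps):
        u = -k * x + rng.uniform(-1.0, 1.0)  # exploration noise on the action
        cost = q * x * x + r * u * u
        x_next = a * x + b * u               # noise-free dynamics (assumed)
        u_next = -k * x_next                 # on-policy next action
        # Bellman equation: Q(x, u) - Q(x_next, u_next) = cost
        d = [p - pn for p, pn in zip(features(x, u), features(x_next, u_next))]
        theta, P = rls_update(theta, P, d, cost)
        x = x_next
    return theta


def policy_iteration(a=1.2, b=1.0, q=1.0, r=1.0, k0=0.5,
                     iters=8, steps=1000, seed=0):
    """Alternate policy evaluation and greedy improvement; k0 must stabilize."""
    rng = random.Random(seed)
    k = k0
    for _ in range(iters):
        h11, h12, h22 = evaluate_policy(k, a, b, q, r, steps, rng)
        k = h12 / h22  # greedy policy minimizes Q over u: u = -(h12 / h22) * x
    return k


def riccati_gain(a, b, q, r, iters=1000):
    """Reference optimal gain from iterating the scalar Riccati equation."""
    p = q
    for _ in range(iters):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    return a * b * p / (r + b * b * p)


if __name__ == "__main__":
    print("learned gain:", policy_iteration())
    print("Riccati gain:", riccati_gain(1.2, 1.0, 1.0, 1.0))
```

Comparing the learned gain with the Riccati gain gives a quick check that the iteration settles near the optimal linear policy in this toy setting. The exploration noise is what keeps the regression well conditioned: without it, x and u would be linearly dependent and the three Q-function parameters could not be identified.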