Convergence of Policy Gradient for Stochastic Linear-Quadratic Control Problem in Infinite Horizon

Xinpei Zhang, Guangyan Jia
2024-04-18
Abstract: With the strong performance of the policy gradient (PG) method in reinforcement learning, its convergence theory has attracted growing interest. Meanwhile, the importance of the stochastic linear-quadratic (SLQ) control problem and the extensive theory available for it make it a natural starting point for studying PG in the model-based learning setting. In this paper, we study the PG method for the SLQ problem over an infinite horizon and take a step towards providing rigorous guarantees for gradient methods. Although the cost functional of the linear-quadratic problem is typically nonconvex, we overcome this difficulty using a gradient domination condition and an L-smoothness property, and prove exponential convergence of the gradient flow and linear convergence of the gradient descent algorithm.
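To make the last sentence of the abstract concrete, the following is a sketch of the standard argument by which gradient domination and L-smoothness yield these two rates without convexity. The symbols J (cost), K (feedback gain), K^* (a minimizer), \mu (gradient domination constant), L (smoothness constant), and \eta (step size) are notation assumed here for illustration; the abstract does not state the paper's constants, and for linear-quadratic costs both properties typically hold only on a sublevel set of stabilizing gains, which this sketch takes for granted.

```latex
% Assumed notation (not from the abstract): J, K, K^*, \mu, L, \eta.
% Assumption 1 (gradient domination):
%   J(K) - J(K^*) \le \frac{1}{2\mu} \|\nabla J(K)\|^2 .
% Assumption 2 (L-smoothness along the update segment):
%   J(K') \le J(K) + \langle \nabla J(K), K' - K \rangle + \frac{L}{2}\|K' - K\|^2 .

% Gradient flow \dot K_t = -\nabla J(K_t): exponential convergence.
\begin{align*}
\frac{d}{dt}\bigl(J(K_t) - J(K^*)\bigr)
  &= -\|\nabla J(K_t)\|^2
   \le -2\mu \bigl(J(K_t) - J(K^*)\bigr), \\
J(K_t) - J(K^*)
  &\le e^{-2\mu t}\bigl(J(K_0) - J(K^*)\bigr).
\end{align*}

% Gradient descent K_{k+1} = K_k - \eta \nabla J(K_k) with \eta = 1/L:
% linear convergence.
\begin{align*}
J(K_{k+1})
  &\le J(K_k) - \frac{1}{2L}\|\nabla J(K_k)\|^2
   \le J(K_k) - \frac{\mu}{L}\bigl(J(K_k) - J(K^*)\bigr), \\
J(K_{k+1}) - J(K^*)
  &\le \Bigl(1 - \frac{\mu}{L}\Bigr)\bigl(J(K_k) - J(K^*)\bigr).
\end{align*}
```

The first chain uses Grönwall's inequality; the second substitutes the update into the smoothness bound and then applies gradient domination. Because each descent step does not increase J, the iterates remain in the initial sublevel set, which is why local versions of the two assumptions can suffice.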
Optimization and Control