Data-Driven Policy Gradient Method for Optimal Output Feedback Control of LQR

Jun Xie, Yuan-Hua Ni
2024-01-01
Abstract: Direct data-driven approaches based on Willems' Fundamental Lemma have recently become popular in control theory. For the linear quadratic regulator (LQR) problem, a feedback controller constructed from finite-length past input-output trajectories has a better global convergence guarantee than static output feedback. This paper proposes a direct data-driven policy gradient method for designing an optimal output feedback controller. We first use a non-minimum state representation composed of past input-output data to parameterize the original problem in terms of data. Based on this parameterization, we present a novel data-driven policy optimization method that updates the policy directly, with the gradient computed explicitly from persistently exciting raw data. Moreover, we prove that this data-driven policy gradient method converges globally at a sublinear rate. Simulations validate the theoretical results.
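To make the idea of a non-minimum state built from past input-output data concrete, the following is a minimal sketch rather than the authors' construction. It assumes the state is simply the last N inputs stacked above the last N outputs, and that the controller is a linear gain acting on that vector. The function names, the ordering inside the vector, the sign convention of the feedback law, and all numbers are hypothetical.

```python
from typing import List, Sequence


def past_io_state(u_hist: Sequence[Sequence[float]],
                  y_hist: Sequence[Sequence[float]],
                  horizon: int) -> List[float]:
    """Stack the last `horizon` inputs, then the last `horizon` outputs.

    The ordering (inputs first, oldest sample first) is an assumption
    made for illustration; the paper may use a different convention.
    """
    if horizon < 1:
        raise ValueError("horizon must be at least 1")
    if len(u_hist) < horizon or len(y_hist) < horizon:
        raise ValueError("need at least `horizon` past samples")
    z: List[float] = []
    for u in u_hist[-horizon:]:
        z.extend(u)
    for y in y_hist[-horizon:]:
        z.extend(y)
    return z


def linear_feedback(gain: Sequence[Sequence[float]],
                    z: Sequence[float]) -> List[float]:
    """Return u = gain * z, with the gain stored as a list of rows.

    The sign convention (u = K z rather than u = -K z) is assumed.
    """
    return [sum(k * zi for k, zi in zip(row, z)) for row in gain]


# Toy single-input single-output history with made-up values.
u_hist = [[0.0], [1.0], [-0.5]]
y_hist = [[0.2], [0.4], [0.1]]

z = past_io_state(u_hist, y_hist, horizon=2)
u_next = linear_feedback([[0.5, -1.0, 2.0, 0.0]], z)
```

In this sketch the controller never sees the internal system state. It acts only on measured inputs and outputs, which is what makes the resulting policy an output feedback law.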