Homotopic policy iteration-based learning design for unknown linear continuous-time systems
Ci Chen, Frank L. Lewis, Bo Li
DOI: https://doi.org/10.1016/j.automatica.2021.110153
IF: 6.4
2022-04-01
Automatica
Abstract: Recent results have shown that policy iteration is a powerful reinforcement learning tool for designing a stabilizing control policy for continuous-time systems with unknown system dynamics. Policy iteration involves a model-based initialization stage, i.e., seeking an initial stabilizing control policy, which, however, depends on the full system dynamics, including the drift dynamics and the system input matrix. To remove this model requirement, this paper uses a homotopy-based initialization strategy for policy iteration, wherein a stabilizing control policy for continuous-time systems is obtained by gradually moving a stable system to the original system. We propose two homotopic policy iteration-based stabilizing control schemes, namely a model-based design and a model-free design using system data, both of which are proved to place unstable poles into a stable region. The effectiveness of the proposed designs is validated through an illustrative example.
automation & control systems; engineering, electrical & electronic
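
The homotopy idea in the abstract can be illustrated with a minimal model-based sketch. It is not the authors' algorithm: it assumes one particular homotopy, the shifted family A - alpha*I, which is stable under zero gain when alpha is large, and it reduces alpha to zero while running Kleinman-style policy iteration at each stage. The function names, the 0.9 step factor, and the example matrices are hypothetical choices for illustration. The model-free, data-driven design mentioned in the abstract is not shown.

```python
import numpy as np


def lyap(Ac, M):
    """Solve Ac.T @ P + P @ Ac + M = 0 for P via row-major vectorization."""
    n = Ac.shape[0]
    I = np.eye(n)
    L = np.kron(Ac.T, I) + np.kron(I, Ac.T)
    P = np.linalg.solve(L, -M.reshape(-1)).reshape(n, n)
    return 0.5 * (P + P.T)  # symmetrize against round-off


def policy_iteration(A, B, Q, R, K, iters=50):
    """Kleinman-style policy iteration; K must stabilize A - B @ K."""
    for _ in range(iters):
        Ac = A - B @ K
        P = lyap(Ac, Q + K.T @ R @ K)      # policy evaluation
        K = np.linalg.solve(R, B.T @ P)    # policy improvement
    return K, P


def is_hurwitz(M):
    return np.max(np.linalg.eigvals(M).real) < 0


def homotopic_policy_iteration(A, B, Q, R, max_steps=100):
    """Assumed homotopy: start from A - alpha*I (stable with K = 0), shrink alpha to 0."""
    n, m = B.shape
    I = np.eye(n)
    alpha = max(np.max(np.linalg.eigvals(A).real), 0.0) + 1.0
    K = np.zeros((m, n))
    for _ in range(max_steps):
        K, P = policy_iteration(A - alpha * I, B, Q, R, K)
        if alpha == 0.0:
            return K, P
        # Stability margin of the current closed loop; shrinking alpha by less
        # than this margin keeps K stabilizing for the next, less-shifted system.
        margin = -np.max(np.linalg.eigvals(A - alpha * I - B @ K).real)
        alpha = max(alpha - 0.9 * margin, 0.0)
    raise RuntimeError("homotopy did not reach the original system")


# Hypothetical unstable example (open-loop eigenvalues 1 and -2).
A = np.array([[0.0, 1.0], [2.0, -1.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.eye(1)
K, P = homotopic_policy_iteration(A, B, Q, R)
print("open loop stable:", is_hurwitz(A))
print("closed loop stable:", is_hurwitz(A - B @ K))
```

Because the sketch checks the stability margin from A and B directly, it needs the full model. It therefore shows only how an initial stabilizing gain can be reached without guessing one, not how the model requirement is removed.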