Model-Free Control of Time-Delay Systems Via Policy Gradient Based Adaptive Learning Algorithm

Yongwei Zhang, Shunchao Zhang, Bo Zhao, Derong Liu
DOI: https://doi.org/10.1109/iai50351.2020.9262213
2020-01-01
Abstract: This paper develops a model-free optimal control scheme for discrete-time nonlinear systems with time delays using the policy gradient based adaptive learning (PGAL) algorithm. The PGAL algorithm uses measured data to design an optimal controller for discrete-time systems. Compared with traditional adaptive dynamic programming algorithms, the proposed method is data-based and improves the control input with a policy gradient. The convergence of the PGAL algorithm is proved by showing that the value function converges to the optimum. To implement the PGAL algorithm, an actor-critic framework is constructed to learn the optimal control law and the value function. Finally, a simulation example is presented to demonstrate the effectiveness of the developed method.
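For orientation, the kind of iteration the abstract describes can be sketched in a generic Q-function form. This is an illustrative formulation under assumed notation, not the paper's own equations: the dynamics $F$, the delay $\tau$, the stage cost $U$, the augmented state $z_k$, the step size $\alpha$, and the iteration index $i$ are all assumptions introduced here.

```latex
% Assumed setting: unknown dynamics with a state delay \tau
x_{k+1} = F(x_k,\, x_{k-\tau},\, u_k), \qquad
z_k = \begin{bmatrix} x_k \\ x_{k-\tau} \end{bmatrix}

% Assumed infinite-horizon cost with stage cost U
J(z_k) = \sum_{j=k}^{\infty} U(z_j, u_j)

% Critic: evaluate the current control law u^{(i)} from measured data
Q^{(i)}(z_k, u_k) = U(z_k, u_k) + Q^{(i)}\bigl(z_{k+1},\, u^{(i)}(z_{k+1})\bigr)

% Actor: improve the control law with a gradient step on the critic
u^{(i+1)}(z_k) = u^{(i)}(z_k)
  - \alpha \left. \frac{\partial Q^{(i)}(z_k, u)}{\partial u} \right|_{u = u^{(i)}(z_k)}
```

In this sketch the critic needs only measured tuples $(z_k, u_k, z_{k+1})$, so $F$ is never identified, and the actor step replaces the explicit minimization over $u$ with a gradient update. That is one way to read "data-based" and "improves the control input with a policy gradient" in the abstract.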