Adaptive Dynamic Programming for Data-Based Optimal State Regulation with Experience Replay.

Chen An, Jiaxi Zhou
DOI: https://doi.org/10.1016/j.neucom.2023.126616
IF: 6
2023-01-01
Neurocomputing
Abstract: Traditional model-based control methods require accurate system dynamics. In practice, however, the dynamics are usually unknown, and manually tuning the control parameters of a complex nonlinear system is difficult. In this paper, we propose a novel reinforcement learning method that combines the advantages of a model-based method, namely Adaptive Dynamic Programming (ADP), with the actor-critic method. Specifically, a linear approximate model is chosen to estimate the dynamics as part of the optimal policy. Then, a designed actor-critic structure is used to obtain the sub-policy. We provide a theoretical proof of convergence and validate the proposed method through simulation experiments. The experimental results show that the proposed method achieves smaller tracking errors and faster learning than a controller trained by the actor-critic method.
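As a rough illustration of the "linear approximate model" step, the sketch below fits a discrete-time model x_next ≈ A x + B u to sampled transitions by ordinary least squares. The abstract does not state the model form or the estimation procedure, so the discrete-time structure, the noise-free synthetic data, and every name here (`fit_linear_model`, `A_true`, `B_true`) are assumptions made for illustration, not the authors' method.

```python
import numpy as np

def fit_linear_model(states, inputs, next_states):
    """Least-squares fit of x_next ≈ A x + B u (assumed model form).

    states:      (N, n) array of sampled states x_k
    inputs:      (N, m) array of sampled controls u_k
    next_states: (N, n) array of successor states x_{k+1}
    """
    n = states.shape[1]
    regressors = np.hstack([states, inputs])            # (N, n + m)
    # Solve regressors @ theta ≈ next_states, where theta = [A^T; B^T].
    theta, *_ = np.linalg.lstsq(regressors, next_states, rcond=None)
    A_hat = theta[:n].T
    B_hat = theta[n:].T
    return A_hat, B_hat

# Hypothetical two-state, one-input system used only to generate data.
rng = np.random.default_rng(0)
A_true = np.array([[0.9, 0.1],
                   [0.0, 0.8]])
B_true = np.array([[0.0],
                   [0.5]])
X = rng.normal(size=(200, 2))
U = rng.normal(size=(200, 1))
X_next = X @ A_true.T + U @ B_true.T                    # noise-free transitions

A_hat, B_hat = fit_linear_model(X, U, X_next)
```

With noise-free data from a truly linear system, the estimates recover the generating matrices up to numerical precision. For a nonlinear plant such a fit is only a local approximation, which is presumably why the abstract pairs the model with a learned sub-policy.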