Off-policy Neuro-Optimal Control for Unknown Complex-Valued Nonlinear Systems Based on Policy Iteration

Ruizhuo Song, Qinglai Wei, Wendong Xiao
DOI: https://doi.org/10.1007/s00521-015-2144-0
2016-01-01
Abstract: This paper develops an optimal control method for unknown complex-valued nonlinear systems. Policy iteration is used to solve the Hamilton–Jacobi–Bellman equation. Off-policy learning allows the iterative performance index and the iterative control to be obtained even when the system dynamics are completely unknown. A critic network and an action network approximate the iterative performance index and the iterative control, respectively, carrying out policy evaluation and policy improvement. Asymptotic stability of the closed-loop system and convergence of the iterative performance index function are proven. Using Lyapunov techniques, the weight errors are proven to be uniformly ultimately bounded. A simulation study demonstrates the effectiveness of the proposed optimal control method.
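To make the policy evaluation and policy improvement steps concrete, the following is a minimal sketch of policy iteration on a toy problem. It is not the paper's method: it assumes a real-valued, scalar, linear system dx/dt = a*x + b*u with known parameters and cost integral of q*x^2 + r*u^2, whereas the paper treats unknown complex-valued nonlinear dynamics with off-policy learning and neural networks. All names (`policy_iteration`, `riccati_solution`, `a`, `b`, `q`, `r`, `k0`) are illustrative assumptions.

```python
import math


def policy_iteration(a, b, q, r, k0, tol=1e-12, max_iter=50):
    """Policy iteration for dx/dt = a*x + b*u, cost = integral of q*x^2 + r*u^2.

    The policy is u = -k*x and the performance index is V(x) = p*x^2.
    Assumption: the model (a, b) is known, unlike in the paper.
    """
    if a - b * k0 >= 0:
        raise ValueError("initial gain k0 must stabilize the system (a - b*k0 < 0)")
    k = k0
    p = None
    for _ in range(max_iter):
        # Policy evaluation: solve 2*(a - b*k)*p + q + r*k^2 = 0 for p.
        p_new = (q + r * k * k) / (2.0 * (b * k - a))
        # Policy improvement: minimize r*u^2 + 2*p*x*b*u over u, giving k = b*p/r.
        k = b * p_new / r
        if p is not None and abs(p_new - p) < tol:
            p = p_new
            break
        p = p_new
    return p, k


def riccati_solution(a, b, q, r):
    """Positive root of the scalar Riccati equation 2*a*p - (b*p)^2/r + q = 0."""
    return r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)


# Example run with hypothetical parameters; k0 = 3 stabilizes the unstable plant a = 1.
p, k = policy_iteration(a=1.0, b=1.0, q=1.0, r=1.0, k0=3.0)
print(p, riccati_solution(1.0, 1.0, 1.0, 1.0))
```

In this toy setting the iterates can be checked against the closed-form Riccati solution. In the setting the abstract describes, the evaluation and improvement steps are instead carried out by the critic and action networks using measured data.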