Abstract: In this article, a real-time online off-policy reinforcement learning (RL) method is developed for the optimal control of unknown continuous-time nonlinear systems. First, by applying the temporal difference technique to the iterative procedure of off-policy RL, the iterative value function and the iterative policy input can be learned online in real time. It is proven that the fitting error of the neural network (NN) weights converges exponentially in each iteration. Second, a model-free Hamilton–Jacobi–Bellman equation (MF-HJBE) is derived by taking the limit of the iterative procedure of off-policy RL. This not only eliminates the system dynamics that appear in the classical HJBE but also removes the iteration index. By applying temporal difference to the MF-HJBE, a real-time online tuning rule is designed to learn the optimal value function and the optimal policy input. It is proven that the fitting error of the NN weights under this real-time online tuning rule converges exponentially. Note that both online tuning rules, the iterative one and the real-time one, use only current and previous state data extracted from system trajectories. Moreover, it is proven using Lyapunov's direct method that the system solution is uniformly ultimately bounded. Finally, simulation results demonstrate the validity of the proposed method.
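
The idea of tuning weights from only the current and previous state can be illustrated with a much simpler problem than the one treated in the article. The sketch below is not the article's algorithm: it is a toy temporal-difference policy evaluation for a fixed linear policy on a scalar linear plant, with a one-weight critic V(x) = w x^2. The plant parameters, gains, step sizes, and the function name `td_policy_evaluation` are all assumptions chosen for illustration. The plant model is used only to generate trajectory data; the weight update itself sees just two consecutive state samples.

```python
import math


def td_policy_evaluation(a=-1.0, b=1.0, k=1.0, q=1.0, r=1.0,
                         dt=0.05, alpha=0.5, episodes=200, steps=20):
    """Learn w in V(x) = w * x**2 for the fixed policy u = -k * x.

    Hypothetical toy setup: plant dx/dt = a*x + b*u, running cost
    q*x**2 + r*u**2. The learner never reads a or b; they are used
    only to simulate the closed-loop trajectory.
    """
    a_closed = a - b * k          # closed-loop pole (simulator only)
    cost_gain = q + r * k * k     # running cost is cost_gain * x**2
    w = 0.0                       # critic weight, arbitrary initial guess

    for episode in range(episodes):
        # Restart from varied initial states so the data stay exciting.
        x_prev = 1.0 + 0.5 * math.sin(episode)
        for _ in range(steps):
            # Simulator: exact solution of the closed-loop ODE over dt.
            x_now = x_prev * math.exp(a_closed * dt)

            # Running cost over the interval, approximated by the
            # trapezoidal rule from the two measured states.
            cost = 0.5 * dt * cost_gain * (x_prev ** 2 + x_now ** 2)

            # Temporal-difference residual of the integral Bellman
            # equation V(x_prev) = cost + V(x_now).
            dphi = x_now ** 2 - x_prev ** 2
            td_error = w * dphi + cost

            # Normalized gradient-descent step on 0.5 * td_error**2.
            w -= alpha * td_error * dphi / (1.0 + dphi * dphi)

            x_prev = x_now
    return w


if __name__ == "__main__":
    # Analytic weight for the defaults: cost_gain / (-2 * a_closed) = 0.5.
    print(td_policy_evaluation())
```

With these default values the analytic weight is 0.5, and the learned weight settles close to it; the small residual gap comes from the trapezoidal approximation of the running cost. Each update multiplies the weight error by a factor below one whenever the state changes between samples, which is a simple analogue of the exponential convergence claimed in the abstract.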

Reinforcement Learning Control for Nonlinear Systems Based on Elman Neural Network

Nonlinear Predictive Control With Error Compensation Based On Neural Network

Reinforcement Learning-Based Control for a Class of Nonlinear Systems with Unknown Control Directions

Reinforcement Learning-Based Control for Nonlinear Discrete-Time Systems with Unknown Control Directions and Control Constraints

A Learning-Based Optimal Tracking Controller for Continuous Linear Systems with Unknown Dynamics: Theory and Case Study

Neural Network Based Multi-Step Predictive Control for Nonlinear Systems

Neural network based iterative learning predictive control design for mechatronic systems with isolated nonlinearity

Adaptive Extended State Space Predictive Control for a Kind of Nonlinear Systems

Extended State Space Predictive Control for a Class of Nonlinear Systems

Online reinforcement learning control of unknown nonaffine nonlinear discrete time systems

Control of Nonaffine Nonlinear Discrete-Time Systems Using Reinforcement-Learning-Based Linearly Parameterized Neural Networks

Online Reinforcement Learning-based Neural Network Controller Design for Affine Nonlinear Discrete-time Systems

Nonlinear Stable Adaptive Control Based Upon Elman Networks

Adaptive output feedback reinforcement learning control for continuous time switched stochastic nonlinear systems with unknown control coefficients and full-state constraints

Robust Adaptive Repetitive Learning Control for a Class of Time-Varying Nonlinear Systems with Unknown Control Direction

Model-free Neural Control of a Class of Nonlinear Plants

Discrete-Time Adaptive Iterative Learning Control for High-Order Nonlinear Systems with Unknown Control Directions

NN/RISE-based asymptotic tracking control of uncertain nonlinear systems

Reinforcement Learning Controller Design for Affine Nonlinear Discrete-Time Systems Using Online Approximators

Networked Control of Nonlinear Systems under Partial Observation Using Continuous Deep Q-Learning

Online Off-Policy Reinforcement Learning for Optimal Control of Unknown Nonlinear Systems Using Neural Networks