Abstract: This study presents an adaptive optimal tracking control method for nonlinear strict‐feedback systems with unmeasurable states and asymmetric time‐varying constraints. The proposed control scheme converts the constrained system into its unconstrained counterpart and stabilizes it with an observer‐critic‐actor reinforcement learning algorithm, thereby improving stability and tracking precision without using conventional barrier Lyapunov functions. Numerical and practical simulations validate the proposed control strategy's efficacy.

In this article, the problem of adaptive optimal tracking control is studied for nonlinear strict‐feedback systems. The states of these systems are not directly measurable and are subject to constraints that are both time‐varying and asymmetric. Bypassing the conventional barrier Lyapunov function method, the constrained system is transformed into its unconstrained counterpart, thereby obviating the need for feasibility conditions. A specially designed reinforcement learning (RL) algorithm, featuring an observer‐critic‐actor architecture, is deployed in an adaptive optimal control scheme to ensure the stabilization of the converted unconstrained system. Within this architecture, the observer estimates the unmeasurable system states, the critic evaluates the control performance, and the actor executes the control actions. Furthermore, enhancements to the RL algorithm relax the persistent excitation conditions, and the design methodology for the observer overcomes the restrictions imposed by the Hurwitz equation. The Lyapunov stability theorem is applied for two primary purposes: to establish the boundedness of all signals within the closed‐loop system, and to ensure that the output signal accurately tracks the desired reference trajectory. Finally, numerical and practical simulations are provided to corroborate the effectiveness of the proposed control strategy.

Nonlinear Neuro-Optimal Tracking Control via Stable Iterative Q-Learning Algorithm

A Learning-Based Optimal Tracking Controller for Continuous Linear Systems with Unknown Dynamics: Theory and Case Study

Optimal Control for Constrained Discrete-Time Nonlinear Systems Based on Safe Reinforcement Learning

Model-Free Optimal Tracking Design With Evolving Control Strategies via Q-Learning

Discrete-Time Adaptive Iterative Learning Control for High-Order Nonlinear Systems with Unknown Control Directions

Control of Nonaffine Nonlinear Discrete-Time Systems Using Reinforcement-Learning-Based Linearly Parameterized Neural Networks

Reinforcement Learning-Based Control for Nonlinear Discrete-Time Systems with Unknown Control Directions and Control Constraints

The Adaptive Optimal Output Feedback Tracking Control of Unknown Discrete-Time Linear Systems Using a Multistep Q-Learning Approach

Iterative Learning Control Of Varying Trajectories For Robot Manipulators

Online reinforcement learning control of unknown nonaffine nonlinear discrete time systems

NN-based asymptotic tracking control for a class of strict-feedback uncertain nonlinear systems with output constraints

Discrete-time adaptive iterative learning control with unknown control directions

Near Optimal Neural Network-based Output Feedback Control of Affine Nonlinear Discrete-Time Systems

Asynchronous iterative Q-learning based tracking control for nonlinear discrete-time multi-agent systems

Optimal trajectory tracking for uncertain linear discrete‐time systems using time‐varying Q‐learning

Online Off-Policy Reinforcement Learning for Optimal Control of Unknown Nonlinear Systems Using Neural Networks

Adaptive optimized backstepping tracking control for full‐state constrained nonlinear strict‐feedback systems without using barrier Lyapunov function method

Adaptive Optimal Tracking Control of Unknown Nonlinear Systems Using System Augmentation

Reinforcement Learning for Adaptive Optimal Stationary Control of Linear Stochastic Systems

Lifelong Learning-Based Optimal Trajectory Tracking Control of Constrained Nonlinear Affine Systems Using Deep Neural Networks

Iterative Learning Control of Non-Identical Desired Trajectories for a Class of Nonlinear Time-Varying Systems