Abstract: This article introduces a novel optimal trajectory tracking control scheme designed for uncertain linear discrete‐time (DT) systems. In contrast to traditional tracking control methods, our approach removes the requirement for the reference trajectory to align with the generator dynamics of an autonomous dynamical system. Moreover, it does not demand the complete desired trajectory to be known in advance, whether through the generator model or any other means. Instead, our approach can dynamically incorporate segments (finite horizons) of reference trajectories and autonomously learn an optimal control policy to track them in real time. To achieve this, we address the tracking problem by learning a time‐varying Q‐function through state feedback. This Q‐function is then utilized to calculate the optimal feedback gain and explicitly time‐varying feedforward control input, all without the need for prior knowledge of the system dynamics or having the complete reference trajectory in advance. Additionally, we introduce an adaptive observer to extend the applicability of the tracking control scheme to situations where full state measurements are unavailable. We rigorously establish the closed‐loop stability of our optimal adaptive control approach, both with and without the adaptive observer, employing Lyapunov theory. Moreover, we characterize the optimality of the controller with respect to the finite horizon length of the known components of the desired trajectory. To further enhance the controller's adaptability and effectiveness in multitask environments, we employ the Efficient Lifelong Learning Algorithm, which leverages a shared knowledge base within the recursive least squares algorithm for multitask Q‐learning. The efficacy of our approach is substantiated through a comprehensive set of simulation results using a power system example.
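To illustrate how a quadratic Q‐function can yield both a feedback gain and a feedforward input, the following is a minimal sketch and not the article's algorithm. It assumes the Q‐function at one time step has the form Q(x, u) = zᵀHz with z = [x; u; 1], so that minimizing over u gives u = -Kx + u_ff. The matrix layout, the helper names `policy_from_q_matrix` and `q_value`, and all numerical values are hypothetical. Here H is built from a known model purely for the demonstration, whereas the article learns the Q‐function from data without a model.

```python
import numpy as np

def policy_from_q_matrix(H, n, m):
    """Extract (K, u_ff) from a symmetric Q-function matrix H over z = [x; u; 1].

    Setting the gradient of z^T H z with respect to u to zero gives
    H_ux x + H_uu u + H_u1 = 0, hence u = -K x + u_ff.
    """
    H_ux = H[n:n + m, :n]          # coupling between input and state
    H_uu = H[n:n + m, n:n + m]     # input-input block (must be positive definite)
    H_u1 = H[n:n + m, n + m]       # coupling between input and the constant term
    K = np.linalg.solve(H_uu, H_ux)        # feedback gain
    u_ff = -np.linalg.solve(H_uu, H_u1)    # feedforward input
    return K, u_ff

def q_value(H, x, u):
    """Evaluate the quadratic Q-function z^T H z at (x, u)."""
    z = np.concatenate([x, u, [1.0]])
    return float(z @ H @ z)

# Hypothetical one-step example (all values are assumptions for illustration).
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Qx = np.eye(2)                  # state weight
R = np.array([[0.1]])           # input weight
P = np.eye(2)                   # weight on next-step tracking error
r_next = np.array([1.0, 0.0])   # reference value at the next step

# Next-step tracking error is linear in z: x_next - r_next = M z.
M = np.hstack([A, B, -r_next[:, None]])
H = M.T @ P @ M
H[:2, :2] += Qx
H[2:3, 2:3] += R

K, u_ff = policy_from_q_matrix(H, n=2, m=1)
x = np.array([0.5, -0.2])
u_star = -K @ x + u_ff          # minimizing input at this state
```

Because the reference enters only through the last column of H, a change in the reference segment alters `u_ff` while leaving `K` untouched in this sketch, which is one way to read the abstract's distinction between the feedback gain and the time‐varying feedforward input.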

Model-Free Optimal Tracking Design With Evolving Control Strategies via Q-Learning

A Learning-Based Optimal Tracking Controller for Continuous Linear Systems with Unknown Dynamics: Theory and Case Study

A new Q‐function structure for model‐free adaptive optimal tracking control with asymmetric constrained inputs

Optimal trajectory tracking for uncertain linear discrete‐time systems using time‐varying Q‐learning

Adaptive Learning-Based Path-Tracking Control for Unknown Vehicle Systems under Performance Optimization

A High-Order Internal Model Based Iterative Learning Control Scheme for Discrete Linear Time-Varying Systems

Human-in-the-loop Distributed Cooperative Tracking Control with Applications to Autonomous Ground Vehicles: A Data-Driven Mixed Iteration Approach

Optimal tracking control of batch processes with time-invariant state delay: Adaptive Q-learning with two-dimensional state and control policy

Realization of Exact Tracking Control for Nonlinear Systems Via a Nonrecursive Dynamic Design

Asynchronous iterative Q-learning based tracking control for nonlinear discrete-time multi-agent systems

H∞ Tracking Control for Linear Discrete-Time Systems: Model-Free Q-Learning Designs

Adaptive Constrained Optimal Control Design for Data-Based Nonlinear Discrete-Time Systems With Critic-Only Structure

Model-free Value Iteration Algorithm for Continuous-time Stochastic Linear Quadratic Optimal Control Problems

The Adaptive Optimal Output Feedback Tracking Control of Unknown Discrete-Time Linear Systems Using a Multistep Q-Learning Approach

Optimal Tracking Control of Nonlinear Multiagent Systems Using Internal Reinforce Q-Learning

Finite-time optimal tracking control using augmented error system method

Novel data-driven two-dimensional Q-learning for optimal tracking control of batch process with unknown dynamics

Stability Analysis of Model-Free Control under Iterative Q-learning Algorithms

Model-based reinforcement learning for infinite-horizon approximate optimal tracking

Quadratic Tracking Control of Linear Stochastic Systems with Unknown Dynamics Using Average Off-Policy Q-Learning Method

Fuzzy Adaptive Tracking of Constrained Nonlinear Systems with Event-Sampling Reinforcement Learning