Abstract: This article is concerned with the optimal tracking control problem of the coupled Markov jump system (CMJS) using the reinforcement learning (RL) technique. Based on the conventional optimal tracking architecture, an offline tracking iteration algorithm is first designed to solve the coupled algebraic Riccati equation, which can hardly be solved directly by mathematical methods. To overcome the strict requirements and existing shortcomings of the offline tracking method, a novel integral RL (IRL) tracking algorithm is then proposed for the CMJS, which develops a transition-probability-free optimal tracking control scheme with a reconstructed augmented system and a discounted cost function. The requirements of both the transition probability $\pi_{ij}$ and the system matrix $A_i$ are avoided via the designed IRL algorithm. The stability and convergence of the novel schemes are proved by Lyapunov theory, and the tracking objective is achieved as desired. Finally, we apply the designed algorithms to a fourth-order Markov jump control problem and to a stochastic mass, spring, and damper system to track continuous sinusoidal waveforms, and simulation results are provided to show their effectiveness and applicability.

Note to Practitioners: In practical engineering systems, many useful signals and interference vary randomly. Therefore, the tracking control of stochastic systems and dynamics, such as Markovian, Itô, Wiener, and martingale processes, plays an important role in modern industry. In practice, it is always desirable to reduce the requirement for exact system information and transition probabilities in the homogeneous Markovian process, for which accurate measurements are very difficult to obtain. One way is to integrate the adaptive reinforcement learning (RL) technique into Markovian systems to learn this implicit information. However, a major restriction of the RL technique is that the control policy must be associated with a finite performance index, which generally invalidates the optimal tracking solutions. To tackle this difficulty, a novel parallel scheme is designed via the integral RL (IRL) technique, by which the coupled algebraic Riccati equation is solved while the transition probability can be completely unknown during the learning process.
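The abstract does not write out the coupled algebraic Riccati equation or the offline iteration itself. As a rough illustration only, the sketch below assumes the standard undiscounted regulation form for a continuous-time Markov jump linear system, $A_i^\top P_i + P_i A_i - P_i B_i R_i^{-1} B_i^\top P_i + Q_i + \sum_j \pi_{ij} P_j = 0$, and solves it with a Lyapunov-type iteration in which each mode's equation is decoupled using the previous iterate. This is not the article's tracking algorithm: the augmented system and discount factor are omitted, the dynamics and transition rates are assumed known, and all matrices, function names, and numerical values are hypothetical.

```python
import numpy as np

def solve_lyapunov(Ac, M):
    """Solve Ac.T @ P + P @ Ac + M = 0 for P using Kronecker products."""
    n = Ac.shape[0]
    I = np.eye(n)
    L = np.kron(Ac.T, I) + np.kron(I, Ac.T)
    P = np.linalg.solve(L, -M.reshape(-1)).reshape(n, n)
    return 0.5 * (P + P.T)  # symmetrize against round-off

def care_residual(A, B, Q, R, Pi, P):
    """Largest absolute entry of the coupled Riccati residual over all modes."""
    worst = 0.0
    for i in range(len(A)):
        coupling = sum(Pi[i, j] * P[j] for j in range(len(A)))
        G = B[i] @ np.linalg.solve(R[i], B[i].T)
        res = A[i].T @ P[i] + P[i] @ A[i] - P[i] @ G @ P[i] + Q[i] + coupling
        worst = max(worst, np.abs(res).max())
    return worst

def offline_iteration(A, B, Q, R, Pi, iters=200, tol=1e-12):
    """Offline (model-based) iteration; assumes zero gain is stabilizing."""
    N, n = len(A), A[0].shape[0]
    P = [np.zeros((n, n)) for _ in range(N)]
    for _ in range(iters):
        P_new = []
        for i in range(N):
            K = np.linalg.solve(R[i], B[i].T @ P[i])            # mode-i gain
            Ac = A[i] + 0.5 * Pi[i, i] * np.eye(n) - B[i] @ K   # shifted closed loop
            M = Q[i] + K.T @ R[i] @ K + sum(
                Pi[i, j] * P[j] for j in range(N) if j != i)    # lagged coupling
            P_new.append(solve_lyapunov(Ac, M))
        change = max(np.abs(a - b).max() for a, b in zip(P_new, P))
        P = P_new
        if change < tol:
            break
    return P

# Hypothetical two-mode example (values chosen only for illustration).
A = [np.array([[-1.0, 0.5], [0.0, -2.0]]), np.array([[-2.0, 0.0], [0.3, -1.0]])]
B = [np.array([[0.0], [1.0]]), np.array([[1.0], [0.5]])]
Q = [np.eye(2), np.eye(2)]
R = [np.eye(1), np.eye(1)]
Pi = np.array([[-0.5, 0.5], [0.3, -0.3]])  # transition rate matrix, rows sum to 0

P = offline_iteration(A, B, Q, R, Pi)
print("Riccati residual:", care_residual(A, B, Q, R, Pi, P))
```

Every step of this sketch uses both the system matrices and the transition rates `Pi`, which is the dependence the IRL scheme described above is designed to remove.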

Off-policy reinforcement learning for tracking control of discrete-time Markov jump linear systems with completely unknown dynamics.

Reinforcement Learning-Based $\mathcal{H}_{\infty}$ Control of 2-D Markov Jump Roesser Systems with Optimal Disturbance Attenuation

A Learning-Based Optimal Tracking Controller for Continuous Linear Systems with Unknown Dynamics: Theory and Case Study

Reinforcement Learning‐based Adaptive Optimal Tracking Algorithm for Markov Jump Systems with Partial Unknown Dynamics

Optimal tracking control for completely unknown nonlinear discrete-time Markov jump systems using data-based reinforcement learning method.

Data-Efficient Off-Policy Learning for Distributed Optimal Tracking Control of HMAS with Unidentified Exosystem Dynamics.

H∞ Optimal Output Tracking Control for Markov Jump Systems: A Reinforcement Learning‐based Approach

Off-Policy Reinforcement Learning for Optimal Preview Tracking Control of Linear Discrete-Time Systems with Unknown Dynamics

Reinforcement Learning-Based Optimal Control for Markov Jump Systems with Completely Unknown Dynamics

Reinforcement Learning and Adaptive Optimization of a Class of Markov Jump Systems with Completely Unknown Dynamic Information

Parallel Optimal Tracking Control Schemes for Mode-Dependent Control of Coupled Markov Jump Systems Via Integral RL Method

Optimal Tracking Control for Multi-player Non-Zero-Sum Games of Continuous-Time Linear Systems with Unknown Dynamics.

Data-driven Optimal Tracking Control for a Class of Affine Non-Linear Continuous-Time Systems with Completely Unknown Dynamics

Model-free optimal controller for discrete-time Markovian jump linear systems: A Q-learning approach

Optimal Tracking Control for Non-Zero-sum Games of Linear Discrete-Time Systems Via Off-Policy Reinforcement Learning

Finite-time L2−l∞ Tracking Control for Markov Jump Repeated Scalar Nonlinear Systems with Partly Usable Model Information

Game Theoretical Reinforcement Learning for Robust H∞ Tracking Control of Discrete-Time Linear Systems with Unknown Dynamics

Fuzzy-Based Adaptive Optimization of Unknown Discrete-Time Nonlinear Markov Jump Systems With Off-Policy Reinforcement Learning

Linear Quadratic Tracking Control of Unknown Systems: A Two-Phase Reinforcement Learning Method.

Data-Driven Robust Control of Discrete-Time Uncertain Linear Systems Via Off-Policy Reinforcement Learning.
