Parallel Optimal Tracking Control Schemes for Mode-Dependent Control of Coupled Markov Jump Systems Via Integral RL Method
Kun Zhang, Hua-guang Zhang, Yuliang Cai, Rong Su
DOI: https://doi.org/10.1109/tase.2019.2948431
IF: 6.636
2020-01-01
IEEE Transactions on Automation Science and Engineering
Abstract: This article is concerned with the optimal tracking control problem of the coupled Markov jump system (CMJS) by using the reinforcement learning (RL) technique. Based on the conventional optimal tracking architecture, an offline tracking iteration algorithm is first designed to solve the coupled algebraic Riccati equation, which can hardly be solved directly by mathematical methods. To overcome the strict requirements and existing shortcomings of the offline tracking method, a novel integral RL (IRL) tracking algorithm is proposed for the CMJS, which develops a transition-probability-free optimal tracking control scheme with a reconstructed augmented system and a discounted cost function. The designed IRL algorithm requires neither the transition probability π_ij nor the system matrix A_i. The stability and convergence of the novel schemes are proved by Lyapunov theory, and the tracking objective is achieved as desired. Finally, we apply the designed algorithms to a fourth-order Markov jump control problem and to a stochastic mass, spring, and damper system to track continuous sinusoidal waveforms, and simulation results are provided to show their effectiveness and applicability. Note to Practitioners: In practical engineering systems, many useful signals and interference vary randomly. Therefore, the tracking control of stochastic systems and dynamics, such as Markovian, Itô, Wiener, and martingale processes, plays an important role in modern industry. As a matter of fact, it is always desirable to reduce the need for exact system information and transition probabilities in the homogeneous Markovian process, for which accurate measurements are very difficult to obtain. One way is to integrate the adaptive reinforcement learning (RL) technique into Markovian systems to learn this implicit information.
However, a major restriction of the RL technique is that the control policy should be related to a finite performance index, which generally invalidates the optimal tracking solutions. To tackle this difficulty, a novel parallel scheme is designed via the integral RL (IRL) technique, by which the coupled algebraic Riccati equation is solved while the transition probability can be completely unknown during the learning process.
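To make the "offline iteration" idea concrete, the following is a minimal sketch, not the algorithm from the article. It assumes the standard regulation (non-tracking, undiscounted) form of the coupled algebraic Riccati equation for a scalar Markov jump linear system, 2*a_i*p_i + q_i - b_i^2*p_i^2/r_i + sum_j π_ij*p_j = 0, and solves it with a Lyapunov-style policy iteration in which every mode is updated from the previous iterate, so the per-mode updates could run in parallel. The function names, the scalar setting, and all numerical values are hypothetical and chosen only for illustration. Note that this offline sketch needs both a_i and π_ij, which is exactly the requirement the abstract says the IRL scheme removes.

```python
def care_residual(p, a, b, q, r, Pi):
    """Residual of the assumed scalar coupled Riccati equation for each mode."""
    n = len(a)
    return [
        2.0 * a[i] * p[i] + q[i] - (b[i] ** 2) * p[i] ** 2 / r[i]
        + sum(Pi[i][j] * p[j] for j in range(n))
        for i in range(n)
    ]


def solve_coupled_are_scalar(a, b, q, r, Pi, tol=1e-12, max_iter=500):
    """Offline policy iteration for the scalar coupled Riccati equations.

    a, b, q, r : per-mode scalars (lists of equal length)
    Pi         : transition-rate matrix (rows sum to zero, Pi[i][i] < 0)
    Assumes the zero gain is stabilizing for every mode (hypothetical setup).
    """
    n = len(a)
    p = [0.0] * n  # current value estimates, one per mode
    k = [0.0] * n  # current feedback gains, assumed stabilizing
    for _ in range(max_iter):
        p_new = []
        for i in range(n):
            # Closed-loop coefficient with the diagonal rate absorbed.
            a_cl = a[i] + 0.5 * Pi[i][i] - b[i] * k[i]
            if a_cl >= 0.0:
                raise ValueError("gain for mode %d is not stabilizing" % i)
            # Coupling uses the previous iterate of the other modes.
            coupling = sum(Pi[i][j] * p[j] for j in range(n) if j != i)
            # Scalar Lyapunov equation: 2*a_cl*p + q + r*k^2 + coupling = 0.
            p_new.append((q[i] + r[i] * k[i] ** 2 + coupling) / (-2.0 * a_cl))
        change = max(abs(x - y) for x, y in zip(p_new, p))
        p = p_new
        # Policy improvement: k_i = b_i * p_i / r_i.
        k = [b[i] * p[i] / r[i] for i in range(n)]
        if change < tol:
            return p, k
    raise RuntimeError("policy iteration did not converge")


# Hypothetical two-mode example.
a = [-1.0, -0.5]
b = [1.0, 1.0]
q = [1.0, 2.0]
r = [1.0, 1.0]
Pi = [[-2.0, 2.0], [1.0, -1.0]]

p, k = solve_coupled_are_scalar(a, b, q, r, Pi)
print("p =", p)
print("k =", k)
print("residual =", care_residual(p, a, b, q, r, Pi))
```

For matrix-valued modes, the scalar division in the evaluation step would become a Lyapunov equation solve per mode; the structure of the loop stays the same.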