Reinforcement Learning-Based Optimal Control for Markov Jump Systems with Completely Unknown Dynamics

Xiongtao Shi,Yanjie Li,Chenglong Du,Chaoyang Chen,Guangdeng Zong,Weihua Gui
DOI: https://doi.org/10.1016/j.automatica.2024.111886
IF: 6.4
2025-01-01
Automatica
Abstract:In this paper, the optimal control problem of a class of unknown Markov jump systems (MJSs) is investigated via the parallel policy iteration-based reinforcement learning (PPI-RL) algorithms. First, by solving the linear parallel Lyapunov equation, a model-based PPI-RL algorithm is studied to learn the solution of nonlinear coupled algebraic Riccati equation (CARE) of MJSs with known dynamics, thereby updating the optimal control gain. Then, a novel partially model-free PPI-RL algorithm is proposed for the scenario that the dynamics of the MJS is partially unknown, in which the optimal solution of CARE is learned via the mixed input-output data of all modes. Furthermore, for the MJS with completely unknown dynamics, a completely model-free PPI-RL algorithm is developed to get the optimal control gain by removing the dependence of model information in the process of solving the optimal solution of CARE. It is proved that the proposed PPI-RL algorithms converge to the unique optimal solution of CARE for MJSs with known, partially unknown, and completely unknown dynamics, respectively. Finally, simulation results are illustrated to show the feasibility and effectiveness of the PPI-RL algorithms.
What problem does this paper attempt to address?