Abstract: Markovian jump linear systems (MJLS) are an important class of dynamical systems that arise in many control applications. In this paper, we introduce the problem of controlling unknown (discrete-time) MJLS as a new benchmark for policy-based reinforcement learning of Markov decision processes (MDPs) with mixed continuous/discrete state variables. Compared with the traditional linear quadratic regulator (LQR), our proposed problem leads to a special hybrid MDP and poses significant new challenges due to an underlying Markov jump parameter that governs the mode of the system dynamics. Specifically, the state of an MJLS alone does not form a Markov chain, and hence one cannot study the MJLS control problem as an MDP with solely continuous state variables. However, one can augment the state with the jump parameter to obtain an MDP with a mixed continuous/discrete state space. We discuss how control theory sheds light on the policy parameterization of such hybrid MDPs. We then modify the widely used natural policy gradient method to directly learn the optimal state feedback control policy for MJLS without identifying either the system dynamics or the transition probabilities of the switching parameter. We implement the (data-driven) natural policy gradient method on different MJLS examples. Our simulation results suggest that the natural policy gradient method can efficiently learn the optimal controller for MJLS with unknown dynamics.
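To make the augmented-state idea concrete, the following is a minimal sketch, not the paper's implementation. It assumes a scalar two-mode MJLS with a mode-dependent linear feedback u = -k[mode] * x, and it only rolls out the closed loop and accumulates a quadratic cost. It does not implement the natural policy gradient update. All numerical values (`A`, `B`, `TRANS`, `K`) and the function name `simulate_mjls_cost` are hypothetical and chosen purely for illustration.

```python
import random

# Hypothetical two-mode scalar MJLS: x[t+1] = A[m] * x[t] + B[m] * u[t].
A = [1.2, 0.8]                      # mode 0 is open-loop unstable
B = [1.0, 1.0]
TRANS = [[0.7, 0.3], [0.4, 0.6]]    # row m: distribution of the next mode
K = [0.9, 0.5]                      # illustrative mode-dependent gains


def simulate_mjls_cost(a, b, k, trans, q=1.0, r=1.0,
                       x0=1.0, mode0=0, horizon=50, seed=0):
    """Roll out the closed loop and return the accumulated quadratic cost.

    The pair (x, mode) is the augmented state: x alone is not Markov,
    because the next x depends on which mode is currently active.
    """
    rng = random.Random(seed)
    x, mode, cost = x0, mode0, 0.0
    for _ in range(horizon):
        u = -k[mode] * x                      # policy sees both x and mode
        cost += q * x * x + r * u * u         # stage cost
        x = a[mode] * x + b[mode] * u         # continuous state update
        # Sample the next mode from the current row of the transition matrix.
        mode = rng.choices(range(len(trans)), weights=trans[mode])[0]
    return cost


controlled = simulate_mjls_cost(A, B, K, TRANS)
uncontrolled = simulate_mjls_cost(A, B, [0.0, 0.0], TRANS)
```

With these assumed gains the closed-loop multiplier is 0.3 in both modes, so the controlled cost stays small, while the zero-gain rollout incurs a larger cost. A model-free method would treat `simulate_mjls_cost` as a black box and adjust `K` using only such rollouts.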

Model-free optimal controller for discrete-time Markovian jump linear systems: A Q-learning approach

Reinforcement Learning-Based $\mathcal{H}_{\infty}$ Control of 2-D Markov Jump Roesser Systems with Optimal Disturbance Attenuation

A Learning-Based Optimal Tracking Controller for Continuous Linear Systems with Unknown Dynamics: Theory and Case Study

H∞ Optimal Output Tracking Control for Markov Jump Systems: A Reinforcement Learning‐based Approach

Stochastic LQ optimal control for Markov jumping systems with multiplicative noise using reinforcement learning

Optimal Vibration Control of a Class of Nonlinear Stochastic Systems with Markovian Jump

Optimal Control for Constrained Discrete-Time Nonlinear Systems Based on Safe Reinforcement Learning

Reinforcement learning‐based composite suboptimal control for Markov jump singularly perturbed systems with unknown dynamics

Reinforcement Learning-Based Direct Adaptive Optimal Control of JLQ Model

Optimal control for continuous-time Markov jump singularly perturbed systems: A hybrid reinforcement learning scheme

A Fuzzy-Model-Based Approach to Optimal Control for Nonlinear Markov Jump Singularly Perturbed Systems: A Novel Integral Reinforcement Learning Scheme

Fuzzy-Based Adaptive Optimization of Unknown Discrete-Time Nonlinear Markov Jump Systems With Off-Policy Reinforcement Learning

Policy Learning of MDPs with Mixed Continuous/Discrete Variables: A Case Study on Model-Free Control of Markovian Jump Systems

Asynchronous Static Output-Feedback Control of Markovian Jump Linear Systems

Average Cost Optimal Control of Stochastic Systems Using Reinforcement Learning

Finite-time L2−l∞ Tracking Control for Markov Jump Repeated Scalar Nonlinear Systems with Partly Usable Model Information

Asynchronous Event-Triggered Output-Feedback Control of Singular Markov Jump Systems

Asynchronous Observer-Based Control for Exponential Stabilization of Markov Jump Systems

Fuzzy $H_{\infty }$ Control of Discrete-Time Nonlinear Markov Jump Systems via a Novel Hybrid Reinforcement $Q$-Learning Method

Optimal Tracking Control of Nonlinear Multiagent Systems Using Internal Reinforce Q-Learning