Model-free optimal tracking policies for Markov jump systems by solving non-zero-sum games

Huiwen Xue, Jiwei Wen, Peixin Zhou, Peng Shi, Xiaoli Luan
DOI: https://doi.org/10.1016/j.ins.2023.119423
IF: 8.1
2023-07-30
Information Sciences
Abstract: This paper develops model-free optimal tracking policies for Markov jump systems by solving non-zero-sum games (NZSGs). First, coupled action and mode-dependent value functions (CAMDVFs) are constructed to solve a two-player NZSG and obtain Nash equilibrium solutions. Second, we propose a value iteration (VI) algorithm that updates the policies under each mode in parallel, using data collected from the different operation modes within each iterative window. Moreover, the convergence of the CAMDVFs, which increase from one iteration to the next, is proved by introducing auxiliary functions between two adjacent iterations. Notably, an influence function is introduced to remove abnormal data, which improves the learning capability of the VI algorithm. Finally, the validity, self-adaptability, and application potential of the tracking policies are verified on a numerical example and a generalized economic model.
computer science, information systems