Data-Driven Nonzero-Sum Game for Discrete-Time Systems Using Off-Policy Reinforcement Learning

Yongliang Yang,Sen Zhang,Jie Dong,Yixin Yin
DOI: https://doi.org/10.1109/access.2019.2960064
IF: 3.9
2020-01-01
IEEE Access
Abstract:In this paper, we develop a data-driven algorithm to learn the Nash equilibrium solution for a two-player non-zero-sum (NZS) game with completely unknown linear discrete-time dynamics based on off-policy reinforcement learning (RL). This algorithm solves the coupled algebraic Riccati equations (CARE) forward in time in a model-free manner by using the online measured data. We first derive the CARE for solving the two-player NZS game. Then, model-free off-policy RL is developed to obviate the requirement of complete knowledge of system dynamics. Besides, on- and off-policy RL algorithms are compared in terms of the robustness against the probing noise. Finally, a simulation example is presented to show the efficacy of the presented approach.
What problem does this paper attempt to address?