Model-Free Solution to the Discrete-Time Coupled Riccati Equation Using Off-Policy Reinforcement Learning

Lu Li, Liming Wang, Yongliang Yang, Jie Dong, Yixin Yin, Shusen Cheng
DOI: https://doi.org/10.23919/chicc.2019.8865951
2019-01-01
Abstract: In this paper, to solve two-player nonzero-sum (NZS) differential games with completely unknown linear discrete-time dynamics, we develop a data-driven algorithm that learns the Nash equilibrium using off-policy reinforcement learning (RL). The algorithm is fully model-free and solves the coupled algebraic Riccati equations (CAREs) forward in time using data measured along the system trajectories. It is shown that solving the two-player NZS differential game reduces to solving the CAREs. Model-based on-policy and model-free off-policy RL algorithms are then presented to solve the CAREs. Compared with on-policy RL, the off-policy RL algorithm eliminates the influence of probing noise and thus guarantees unbiased solutions. Finally, a simulation example demonstrates the efficacy of the proposed approach.
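For orientation, the coupled equations that the abstract refers to can be sketched under a standard linear-quadratic formulation. Everything in this block is an assumption: the abstract itself does not state the notation, the cost structure, or the form of the equations, so the symbols $A$, $B_i$, $Q_i$, $R_{ij}$, $K_i$, and $P_i$ are illustrative and may differ from those used in the paper.

```latex
% Assumed dynamics and costs (illustrative notation, not taken from the paper)
x_{k+1} = A x_k + B_1 u_{1,k} + B_2 u_{2,k}, \qquad
J_i = \sum_{k=0}^{\infty} \left( x_k^\top Q_i x_k
      + u_{1,k}^\top R_{i1} u_{1,k} + u_{2,k}^\top R_{i2} u_{2,k} \right),
\quad i = 1, 2.

% Linear feedback policies and the resulting closed-loop matrix
u_{i,k} = -K_i x_k, \qquad A_c = A - B_1 K_1 - B_2 K_2.

% Coupled Riccati-type equations for the value matrices P_1, P_2
P_i = Q_i + K_1^\top R_{i1} K_1 + K_2^\top R_{i2} K_2 + A_c^\top P_i A_c,
\quad i = 1, 2.

% Stationarity condition for player i with the other player's gain held fixed
K_i = \left( R_{ii} + B_i^\top P_i B_i \right)^{-1} B_i^\top P_i
      \left( A - B_j K_j \right), \quad j \neq i.
```

Under these assumptions, each player's equation depends on the other player's gain through $A_c$ and $K_j$, which is what makes the equations coupled. A pair $(K_1, K_2)$ that satisfies all four relations with a stable $A_c$ is a candidate feedback Nash equilibrium.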