Abstract: Inspired by Nash game theory, this paper proposes a multiplayer mixed-zero-sum (MZS) nonlinear game that combines two situations: zero-sum and nonzero-sum (NZS) Nash games. A synchronous reinforcement learning (RL) scheme based on the identifier-critic structure is developed to learn the Nash equilibrium solution of the proposed MZS game. First, the MZS game formulation is presented: one set of performance indexes is defined for the NZS Nash game among players 1 to N - 1 and player N, and another performance index is defined for the zero-sum game between players N and N + 1, so that player N cooperates with players 1 to N - 1 while competing with player N + 1, which leads to a Nash equilibrium of all players. A single-layer neural network (NN) is then used to approximate the unknown dynamics of the nonlinear game system. Finally, an RL scheme based on NNs is developed to learn the optimal performance indexes, from which the optimal control policy of every player can be computed so that the Nash equilibrium is obtained. Thus, the actor NN widely used in the RL literature is not needed. To this end, a recently proposed adaptive law is used to estimate the unknown identifier coefficient vectors, and an improved adaptive law with the error performance index is further developed to update the critic coefficient vectors. Both linear and nonlinear simulations are presented to demonstrate the existence of the Nash equilibrium for the MZS game and the performance of the proposed algorithm.
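
As a rough illustration of the two equilibrium notions the abstract combines, the standard textbook conditions can be written as follows. This is only a sketch under assumed notation: the symbols u_i, u_{-i}, J_i, and J_z do not appear in the abstract, the assignment of player N as minimizer and player N + 1 as maximizer is an assumption, and the paper's exact formulation may differ.

```latex
% Assumed notation (not from the abstract):
%   u_i     policy of player i
%   u_{-i}  policies of all other players, held at their equilibrium values
%   J_i     performance index of player i in the NZS game, i = 1, ..., N
%   J_z     performance index of the zero-sum game between players N and N+1

% NZS Nash condition: no player can lower its own index by deviating alone.
\[
  J_i(u_i^*, u_{-i}^*) \le J_i(u_i, u_{-i}^*), \qquad i = 1, \dots, N
\]

% Zero-sum saddle-point condition, assuming player N minimizes J_z
% and player N+1 maximizes it.
\[
  J_z(u_N^*, u_{N+1}) \le J_z(u_N^*, u_{N+1}^*) \le J_z(u_N, u_{N+1}^*)
\]
```

Under these assumptions, a set of policies satisfying both conditions at once would be a Nash equilibrium of the mixed game in the sense described above.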

Data-Driven Nonzero-Sum Game for Discrete-Time Systems Using Off-Policy Reinforcement Learning

Model-Free Solution to the Discrete-Time Coupled Riccati Equation Using Off-Policy Reinforcement Learning

Discrete-Time Non-Zero-Sum Games With Completely Unknown Dynamics

Non‐zero‐sum games of discrete‐time Markov jump systems with unknown dynamics: An off‐policy reinforcement learning method

Approximate Nash Solutions for Multiplayer Mixed-Zero-Sum Game with Reinforcement Learning

Off-policy Integral Reinforcement Learning Algorithm in Dealing with Nonzero Sum Game for Nonlinear Distributed Parameter Systems

Integral Reinforcement Learning for Linear Continuous-Time Zero-Sum Games With Completely Unknown Dynamics

Data-Driven Integral Reinforcement Learning for Continuous-Time Non-Zero-Sum Games

Data-driven Adaptive Dynamic Programming Schemes for Non-Zero-sum Games of Unknown Discrete-Time Nonlinear Systems

Reinforcement Learning Based Solution To Two-Player Zero-Sum Game Using Differentiator

Off-Policy Integral Reinforcement Learning Method to Solve Nonlinear Continuous-Time Multiplayer Nonzero-Sum Games

Optimal Tracking Control for Multi-player Non-Zero-Sum Games of Continuous-Time Linear Systems with Unknown Dynamics

Off-policy Based Adaptive Dynamic Programming Method for Nonzero-Sum Games on Discrete-Time System

Optimal Tracking Control for Non-Zero-sum Games of Linear Discrete-Time Systems Via Off-Policy Reinforcement Learning

Discrete-Time Multi-Player Games Based on Off-Policy Q-Learning

Model-Based Reinforcement Learning for Offline Zero-Sum Markov Games

Model‐free Adaptive Optimal Control of Continuous‐time Nonlinear Non‐zero‐sum Games Based on Reinforcement Learning

Model-Free Temporal Difference Learning For Non-Zero-Sum Games

Integral Reinforcement Learning Off-Policy Method for Solving Nonlinear Multi-Player Nonzero-Sum Games with Saturated Actuator

Online reinforcement learning multiplayer non-zero sum games of continuous-time Markov jump linear systems

Policy Iteration Based Q-learning for Linear Nonzero-Sum Quadratic Differential Games