Abstract: This paper integrates game theory, optimal control theory, and reinforcement learning to address the discrete-time (DT) multi-player non-zero-sum game problem. The solutions to non-zero-sum games are known to be given by coupled Riccati equations or coupled Hamilton–Jacobi equations, which are generally difficult to solve analytically and require accurate mathematical models of the system. For most practical industrial systems, however, the system dynamics cannot be obtained accurately or are entirely unavailable, so conventional model-based methods become invalid. To overcome this deficiency, we develop data-based adaptive dynamic programming (ADP) algorithms for completely unknown multi-player systems. First, the Nash equilibrium and stationarity conditions are used to formulate the DT multi-player non-zero-sum game, and a policy iteration algorithm is then applied to approximate the optimal solutions successively. Second, a novel online ADP algorithm combined with a neural-network-based identification scheme is designed, which requires only system data rather than the real system model. Subsequently, a data-driven action-dependent heuristic dynamic programming approach is presented, which circumvents the estimation errors introduced by the identification learning procedure. Finally, two simulation examples are provided to illustrate the feasibility of the proposed schemes.
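To make the policy iteration step concrete, the following is a minimal sketch for the linear-quadratic special case, where the coupled equations reduce to coupled Riccati equations. It is not the paper's algorithm: it assumes the model (A, B_i) is known, whereas the paper targets completely unknown dynamics, and it assumes linear dynamics x_{k+1} = A x_k + sum_j B_j u_j, quadratic costs with weights Q_i and R_ij, a stable A so that zero initial gains are admissible, and a joint solve of the stationarity conditions at each improvement step. All function names, symbols, and numerical values are hypothetical, and convergence is not guaranteed in general.

```python
import numpy as np

def solve_dlyap(Ac, W):
    """Solve P = W + Ac.T @ P @ Ac by vectorisation (adequate for small n)."""
    n = Ac.shape[0]
    lhs = np.eye(n * n) - np.kron(Ac.T, Ac.T)
    return np.linalg.solve(lhs, W.reshape(-1)).reshape(n, n)

def policy_iteration(A, B, Q, R, iters=50):
    """Model-based policy iteration sketch for an N-player LQ non-zero-sum game.

    Player i applies u_i = -K[i] @ x and pays, at every step,
    x' Q[i] x + sum_j u_j' R[i][j] u_j.
    """
    N, n = len(B), A.shape[0]
    m = [Bi.shape[1] for Bi in B]
    offsets = np.concatenate(([0], np.cumsum(m)))
    B_all = np.hstack(B)
    # Assumption: A is stable, so zero gains are an admissible starting policy.
    K = [np.zeros((mi, n)) for mi in m]
    P = []
    for _ in range(iters):
        # Policy evaluation: one Lyapunov equation per player.
        Ac = A - sum(B[j] @ K[j] for j in range(N))
        P = [
            solve_dlyap(Ac, Q[i] + sum(K[j].T @ R[i][j] @ K[j] for j in range(N)))
            for i in range(N)
        ]
        # Policy improvement: solve all stationarity conditions jointly,
        # R_ii K_i + B_i' P_i (sum_j B_j K_j) = B_i' P_i A for every player i.
        M = np.vstack([B[i].T @ P[i] @ B_all for i in range(N)])
        for i in range(N):
            s = slice(offsets[i], offsets[i + 1])
            M[s, s] += R[i][i]
        rhs = np.vstack([B[i].T @ P[i] @ A for i in range(N)])
        K_all = np.linalg.solve(M, rhs)
        K = [K_all[offsets[i]:offsets[i + 1]] for i in range(N)]
    return K, P

# Hypothetical scalar two-player example (values chosen for illustration only).
A = np.array([[0.9]])
B = [np.array([[1.0]]), np.array([[0.5]])]
Q = [np.eye(1), np.eye(1)]
R = [[np.eye(1), 0.5 * np.eye(1)],
     [0.5 * np.eye(1), np.eye(1)]]

K, P = policy_iteration(A, B, Q, R)
Ac = A - sum(Bj @ Kj for Bj, Kj in zip(B, K))
print("feedback gains:", [Ki.ravel() for Ki in K])
print("closed-loop spectral radius:", max(abs(np.linalg.eigvals(Ac))))
```

At a fixed point, each gain satisfies its own stationarity condition given the other players' gains, which is the Nash equilibrium condition restricted to linear feedback policies. A data-based variant would replace the two model-dependent steps with quantities estimated from measured trajectories.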

Online Finite-Horizon Optimal Learning Algorithm for Nonzero-Sum Games with Partially Unknown Dynamics and Constrained Inputs

Online Synchronous Approximate Optimal Learning Algorithm for Multi-Player Non-Zero-Sum Games with Unknown Dynamics

Model-Free Adaptive Optimal Control for Unknown Nonlinear Multiplayer Nonzero-Sum Game

Online Optimal Solutions for Multi-Player Nonzero-Sum Game with Completely Unknown Dynamics

A Single-NN Iterative Adaptive Dynamic Programming Algorithm for Continuous-Time Nonlinear Zero-Sum Games

Online Dual-Network-Based Adaptive Dynamic Programming for Solving Partially Unknown Multi-Player Non-Zero-Sum Games with Control Constraints

Event-triggered Adaptive Dynamic Programming for Multi-Player Zero-Sum Games with Unknown Dynamics

Data-Driven Optimal Control for Multi-Player Non-Zero-Sum Games with Unknown Dynamics

Neural-network-based Learning Algorithms for Cooperative Games of Discrete-Time Multi-Player Systems with Control Constraints Via Adaptive Dynamic Programming

Data-driven Adaptive Dynamic Programming Schemes for Non-Zero-sum Games of Unknown Discrete-Time Nonlinear Systems

Data-based Approximate Optimal Control for Nonzero-Sum Games of Multi-Player Systems Using Adaptive Dynamic Programming

Discrete-Time Nonzero-Sum Games for Multiplayer Using Policy-Iteration-Based Adaptive Dynamic Programming Algorithms

Model-free Adaptive Dynamic Programming for Online Optimal Solution of the Unknown Nonlinear Zero-Sum Differential Game

Iterative ADP Learning Algorithms for Discrete-Time Multi-Player Games

Adaptive Dynamic Programming for Solving Non-Zero-Sum Differential Games

Data-driven Approximate Optimal Tracking Control Schemes for Unknown Non-Affine Non-Linear Multi-Player Systems Via Adaptive Dynamic Programming

Model-free Adaptive Dynamic Programming for Optimal Control of Discrete-time Affine Nonlinear System

Optimal and Stable Control for Two-Player Zero-Sum Game Using Adaptive Dynamic Programming

Policy-Iteration-Based Learning for Nonlinear Player Game Systems with Constrained Inputs

Online Finite-Horizon ADP Algorithm for Solving Non-Cooperative Differential Games