Abstract: This paper addresses the zero‐sum game problem for strict‐feedback nonlinear multiagent systems with full‐state constraints. Specifically, it considers a scenario in which multiple agents aim to optimize their control strategies while accounting for the conflicting objectives of their opponents. To handle the full‐state constraints, a one‐to‐one nonlinear mapping technique is employed to convert the original strict‐feedback system into a more manageable pure‐feedback system without state constraints. To find a Nash equilibrium for the virtual control signals and external disturbances, a simplified reinforcement learning algorithm is proposed, which tackles the challenges posed by solving the Hamilton–Jacobi–Isaacs equation. Unlike existing H∞ optimal control strategies that deal with matching conditions, an H∞ optimal control strategy for strict‐feedback nonlinear systems must address the computational complexity arising from repeated differentiation of the virtual controller. To overcome this high‐order virtual controller problem, an approach based on the dynamic surface technique is introduced. By incorporating an approximation term of the high‐order virtual controller into the value function, the computational complexity challenge is effectively resolved. Based on the Lyapunov stability theorem, it is proved that all signals of the closed‐loop systems are semi‐globally uniformly ultimately bounded and that the tracking control performance can be guaranteed. Finally, simulation results are given to verify the effectiveness of the proposed control strategy.
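The abstract does not state which one‐to‐one nonlinear mapping is used. As a minimal sketch of the general idea only, the snippet below assumes a logarithmic mapping of the kind often seen in state‐constrained control, s = ln((k_low + x) / (k_high − x)), which sends a state confined to (−k_low, k_high) onto the whole real line. The function names and the bounds `k_low` and `k_high` are hypothetical and are not taken from the paper.

```python
import math


def to_unconstrained(x: float, k_low: float, k_high: float) -> float:
    """Map a constrained state x in (-k_low, k_high) to s in R.

    Assumed form (not from the paper): s = ln((k_low + x) / (k_high - x)).
    """
    if not (-k_low < x < k_high):
        raise ValueError("state violates the assumed constraint")
    return math.log((k_low + x) / (k_high - x))


def to_constrained(s: float, k_low: float, k_high: float) -> float:
    """Inverse of the assumed mapping: x = (k_high * e^s - k_low) / (e^s + 1)."""
    e = math.exp(s)
    return (k_high * e - k_low) / (e + 1.0)


if __name__ == "__main__":
    k_low, k_high = 1.0, 2.0  # illustrative bounds only
    for x in (-0.9, 0.0, 0.5, 1.9):
        s = to_unconstrained(x, k_low, k_high)
        print(f"x = {x:+.2f} -> s = {s:+.4f} -> x = {to_constrained(s, k_low, k_high):+.4f}")
```

Because the mapping is strictly increasing and invertible, any bounded trajectory of the transformed state s corresponds to a trajectory of x that stays inside the constraint interval, which is why a controller designed for the unconstrained system can respect the original constraints.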

Q-Learning for Feedback Nash Strategy of Finite-Horizon Nonzero-Sum Difference Games

Inverse linear-quadratic nonzero-sum differential games

A novel Z-function-based completely model-free reinforcement learning method to finite-horizon zero-sum game of nonlinear system

Neural Q-learning for discrete-time nonlinear zero-sum games with adjustable convergence rate

An efficient model‐free adaptive optimal control of continuous‐time nonlinear non‐zero‐sum games based on integral reinforcement learning with exploration

Linear Quadratic Nonzero-Sum Mean-Field Stochastic Differential Games with Regime Switching

Inverse Reinforcement Learning for Identification of Linear-Quadratic Zero-Sum Differential Games

Stochastic linear-quadratic differential game with Markovian jumps in an infinite horizon

Reinforcement Learning for Inverse Non-Cooperative Linear-Quadratic Output-feedback Differential Games

Feedback and Open-Loop Nash Equilibria for LQ Infinite-Horizon Discrete-Time Dynamic Games

Linear-Quadratic Non-Zero Sum Backward Stochastic Differential Game With Overlapping Information

Discrete-Time LQ Stochastic Two-Person Nonzero-Sum Difference Games with Random Coefficients: Open-Loop Nash Equilibrium

The Equivalence Conditions of Optimal Feedback Control-Strategy Operators for Zero-Sum Linear Quadratic Stochastic Differential Game with Random Coefficients

Learning Zero-Sum Linear Quadratic Games with Improved Sample Complexity and Last-Iterate Convergence

Zero‐sum game for nonlinear multiagent systems with full‐state constraints

Two person non-zero-sum linear-quadratic differential game with Markovian jumps in infinite horizon

A kind of linear quadratic non-zero sum differential game of backward stochastic differential equation with asymmetric information

Nash Equilibria for Linear Quadratic Discrete-time Dynamic Games via Iterative and Data-driven Algorithms

Open-loop and closed-loop Nash equilibria for the LQ stochastic difference game

Non‐zero‐sum games of discrete‐time Markov jump systems with unknown dynamics: An off‐policy reinforcement learning method