Abstract: A two-player, zero-sum stochastic game with two variables (z, v) can be transformed into coupled control PDEs by means of the transition probability matrix of a Markov chain. After the transformation, two control variables sigma(i) and two intensity rates mu(i) (i = 1, 2) must be solved for. Yuan and Li (Computational Economics, 2018) give a numerical algorithm based on the first-order necessary condition, which overcomes several drawbacks of the normal algorithm, and describe several numerical experiments demonstrating the performance of their method. However, their method has at least one shortcoming: the final values of the control variables sigma(i) (i = 1, 2) may fall outside their domain of definition, in which case sigma(i) is infeasible and unacceptable. This paper presents a new technique that avoids this situation. An optimization pseudo-algorithm is designed with the following features: (i) starting from given initial points (sigma(0)(i), mu(0)(i)), an active-set algorithm is proposed; (ii) a limited-memory update technique is used in the algorithm to obtain fast convergence and low storage; (iii) global convergence is established under suitable conditions; and (iv) numerical results are reported to demonstrate that the new algorithm is competitive with the normal algorithm.
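
The feasibility issue described in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: the quadratic objective, the box [0, 1] x [0, 1] standing in for the domain of sigma, the step size, and all function names are assumptions chosen for illustration, and the limited-memory update is left out. The sketch only contrasts a plain first-order iteration, whose final iterate leaves the box, with a simple projected-gradient iteration that freezes the components in the active set and therefore stays feasible.

```python
# Toy illustration only: the objective and the bounds are hypothetical,
# not taken from the stochastic game in the paper.

def grad(x):
    # Gradient of f(x) = (x0 - 2)^2 + (x1 + 1)^2; the unconstrained
    # minimizer (2, -1) lies outside the box [0, 1] x [0, 1].
    return [2.0 * (x[0] - 2.0), 2.0 * (x[1] + 1.0)]

def project(x, lo, hi):
    # Clip each component onto its interval [lo_i, hi_i].
    return [min(max(xi, l), h) for xi, l, h in zip(x, lo, hi)]

def active_set(x, g, lo, hi, tol=1e-12):
    # A component is active if it sits on a bound and the descent
    # direction -g would push it further out of the box.
    return {i for i, (xi, gi, l, h) in enumerate(zip(x, g, lo, hi))
            if (xi <= l + tol and gi > 0) or (xi >= h - tol and gi < 0)}

def unconstrained_descent(x0, step=0.25, iters=200):
    # Plain first-order iteration that ignores the bounds.
    x = list(x0)
    for _ in range(iters):
        x = [xi - step * gi for xi, gi in zip(x, grad(x))]
    return x

def active_set_descent(x0, lo, hi, step=0.25, iters=200):
    # Projected gradient with active components frozen at their bounds.
    x = project(x0, lo, hi)
    for _ in range(iters):
        g = grad(x)
        act = active_set(x, g, lo, hi)
        d = [0.0 if i in act else -gi for i, gi in enumerate(g)]
        x_new = project([xi + step * di for xi, di in zip(x, d)], lo, hi)
        if max(abs(a - b) for a, b in zip(x_new, x)) < 1e-10:
            break
        x = x_new
    return x

lo, hi = [0.0, 0.0], [1.0, 1.0]
sigma_free = unconstrained_descent([0.5, 0.5])
sigma_box = active_set_descent([0.5, 0.5], lo, hi)
print("unconstrained:", sigma_free)
print("active-set:   ", sigma_box)
```

In this toy setting the unconstrained iterate converges toward (2, -1), outside the box, while the active-set iterate stops at the corner (1, 0), where both components are active and no feasible descent direction remains.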

GPI-Based Design for Partially Unknown Nonlinear Two-Player Zero-Sum Games

Policy Iteration Q-Learning for Data-Based Two-Player Zero-Sum Game of Linear Discrete-Time Systems

Discrete-Time Nonzero-Sum Games for Multiplayer Using Policy-Iteration-Based Adaptive Dynamic Programming Algorithms

Integral Policy Iteration for Zero-Sum Games with Completely Unknown Nonlinear Dynamics

Model-Free Adaptive Optimal Control for Unknown Nonlinear Multiplayer Nonzero-Sum Game

A Numerical Optimization Pseudo-Algorithm for Two-Player Zero-Sum Stochastic Games

Robust Adaptive Dynamic Programming for A Three-Player Zero-Sum Differential Game with Unmatched Uncertainties

Novel single-loop policy iteration for linear zero-sum games

Differential Dynamic Programming for Finite-Horizon Zero-Sum Differential Games of Nonlinear Systems

Robust Adaptive Dynamic Programming for A Zero-Sum Differential Game

Design of zero-determinant strategies and its application to networked repeated games

Neural-network-based Zero-Sum Game for Discrete-Time Nonlinear Systems Via Iterative Adaptive Dynamic Programming Algorithm

Risk-Minimizing Two-Player Zero-Sum Stochastic Differential Game via Path Integral Control

Online Optimal Solutions for Multi-Player Nonzero-Sum Game with Completely Unknown Dynamics

Adaptive Dynamic Programming for a Nonlinear Two‐Player Non‐Zero‐Sum Differential Game With State and Input Constraints

Two-Player Zero-Sum Hybrid Games

Approximate N-Player Nonzero-Sum Game Solution for an Uncertain Continuous Nonlinear System

Online Synchronous Approximate Optimal Learning Algorithm for Multi-Player Non-Zero-Sum Games with Unknown Dynamics

Approximate Solution for Three-Player Mixed-Zero-Sum Nonlinear Game via ADP Structure

A Fixed-Point Policy-Iteration-Type Algorithm for Symmetric Nonzero-Sum Stochastic Impulse Control Games