Abstract: Two-team zero-sum games are one of the most important paradigms in game theory. In this paper, we focus on finding an unexploitable equilibrium in large team games. An unexploitable equilibrium is a worst-case policy, where members in the opponent team cannot increase their team reward by taking any policy, e.g., cooperatively changing to other joint policies. As an optimal unexploitable equilibrium in two-team zero-sum games, correlated-team maxmin equilibrium remains unexploitable even in the worst case where players in the opponent team can achieve arbitrary cooperation through a joint team policy. However, finding such an equilibrium in large games is challenging due to the impracticality of evaluating the exponentially large number of joint policies. To solve this problem, we first introduce a general solution concept called restricted correlated-team maxmin equilibrium, which uses a sample factor to address the impossibility of evaluating all joint policies while avoiding the exploitation problem that arises under incomplete joint policy evaluation. We then develop an efficient sequential correlation mechanism, based on which we propose an algorithm for approximating the unexploitable equilibrium in large games. We show that our approach achieves lower exploitability than the state-of-the-art baseline when encountering opponent teams with different exploitation abilities in large team games, including Google Research Football.

What problem does this paper attempt to address?

The paper aims to find an unexploitable equilibrium in large-scale two-team zero-sum games. Specifically, it seeks a worst-case strategy in large team games: one against which members of the opposing team cannot increase their team reward by adopting any other joint strategy, even when they deviate cooperatively.

### Core challenges of the problem

1. **Computational complexity**: In large team games, it is impractical to evaluate all possible joint strategies because the number of joint strategies grows exponentially with the team size.
2. **Limitations of existing methods**: Existing algorithms (such as Team-PSRO) cannot converge to a truly unexploitable equilibrium in large-scale team games due to restrictions on the team joint strategy space.

### Solutions proposed in the paper

To address these challenges, the paper introduces a new solution concept, the restricted correlated-team maxmin equilibrium (rCTME), and tackles the problem in two ways:

1. **Introducing the sample factor**: Limiting the growth rate of the number of joint strategies that need to be evaluated avoids the exponential blow-up.
2. **Developing the sequential correlation mechanism**: The paper proposes an efficient sequential correlation mechanism and, based on it, designs an algorithm (S-PSRO) to approximate the unexploitable equilibrium in large-scale team games.

### Formula representation

- **Definition of CTME**:
  \[ R_j(\pi_j^*, \pi_{-j}^*) \geq R_j(\pi_j, \pi_{-j}^*) \quad \forall \pi_j \in \Pi_j \]
  where $\Pi_j$ is the joint strategy space of team $T_j$.
- **Definition of rCTME**:
  \[ R_j(\pi_j^*, \pi_{-j}^*) \geq R_j(\pi_j, \pi_{-j}^*) \quad \forall \pi_j \in I_j \cup C_j \]
  where $I_j$ and $C_j$ are the individual deviation strategy space and the correlated deviation strategy space, respectively.

Through these methods, the paper aims to provide a more efficient and practical algorithm for finding unexploitable equilibria in large-scale two-team zero-sum games.
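The difference between checking deviations over the full joint strategy space $\Pi_j$ and over a restricted set $I_j \cup C_j$ can be illustrated on a toy example. The sketch below is not the paper's S-PSRO algorithm. It assumes a one-shot team game with deterministic joint actions, a hypothetical reward function `coordination_reward` that pays off only when every team member switches together, and a hypothetical integer `k` that plays the role of a sample budget for correlated deviations. All function names are illustrative.

```python
import itertools
import random


def joint_actions(n_players, n_actions):
    """All deterministic joint actions of a team: n_actions ** n_players tuples."""
    return list(itertools.product(range(n_actions), repeat=n_players))


def individual_deviations(base, n_actions):
    """Joint actions that differ from `base` in at most one player's action."""
    devs = {base}
    for i in range(len(base)):
        for a in range(n_actions):
            devs.add(base[:i] + (a,) + base[i + 1:])
    return devs


def sampled_correlated_deviations(all_joint, k, rng):
    """A random subset of k joint actions (hypothetical stand-in for C_j)."""
    return set(rng.sample(all_joint, min(k, len(all_joint))))


def best_deviation_gain(reward, base, candidates):
    """Largest reward improvement over `base` among the candidate deviations.

    A gain <= 0 means `base` satisfies the inequality for every candidate.
    """
    return max(reward(c) for c in candidates) - reward(base)


def coordination_reward(joint):
    """Toy team reward (assumption): 1 only if every player picks action 1."""
    return 1.0 if all(a == 1 for a in joint) else 0.0


n_players, n_actions = 4, 2
base = (0,) * n_players                       # the team's current joint action
all_joint = joint_actions(n_players, n_actions)

# Checking only single-player deviations misses the profitable joint deviation.
gain_individual = best_deviation_gain(
    coordination_reward, base, individual_deviations(base, n_actions))

# Checking the full joint space finds it, at exponential cost in n_players.
gain_full = best_deviation_gain(coordination_reward, base, all_joint)

# Restricted check: individual deviations plus k sampled correlated ones.
rng = random.Random(0)
k = len(all_joint)                            # full budget recovers the full check
restricted = individual_deviations(base, n_actions) | \
    sampled_correlated_deviations(all_joint, k, rng)
gain_restricted = best_deviation_gain(coordination_reward, base, restricted)
```

In this toy setting the individual-only check reports no profitable deviation even though the team can gain by switching together, which is the kind of exploitation that incomplete joint strategy evaluation can hide. Lowering `k` trades evaluation cost against the chance of missing such a deviation.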

Leveraging Team Correlation for Approximating Equilibrium in Two-Team Zero-Sum Games
