Abstract: In two-player zero-sum games, if both players minimize their average external regret, then the average of the strategy profiles converges to a Nash equilibrium. For n-player general-sum games, however, theoretical guarantees for regret minimization are less well understood. Nonetheless, Counterfactual Regret Minimization (CFR), a popular regret minimization algorithm for extensive-form games, has generated winning three-player Texas Hold'em agents in the Annual Computer Poker Competition (ACPC). In this paper, we provide the first set of theoretical properties for regret minimization algorithms in non-zero-sum games by proving that solutions eliminate iterative strict domination. We formally define dominated actions in extensive-form games, show that CFR avoids iteratively strictly dominated actions and strategies, and demonstrate that removing iteratively dominated actions is enough to win a mock tournament in a small poker game. In addition, for two-player non-zero-sum games, we bound the worst-case performance and show that, in practice, regret minimization can yield strategies very close to equilibrium. Our theoretical advancements lead us to a new modification of CFR for games with more than two players that is more efficient and may be used to generate stronger strategies than previously possible. Furthermore, we present a new three-player Texas Hold'em poker agent that was built using CFR and a novel game decomposition method. Our new agent wins the three-player events of the 2012 ACPC and defeats the winning three-player programs from previous competitions while requiring fewer resources to generate than the 2011 winner. Finally, we show that our CFR modification computes a strategy of equal quality to our new agent in a quarter of the time of standard CFR using half the memory.
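
As a rough illustration of the claim that regret minimization avoids iteratively strictly dominated actions, the sketch below runs plain regret matching in self-play on a small two-player general-sum matrix game. This is not CFR and not the paper's method: it is a normal-form toy, and the payoff matrices, the function names `regret_matching_strategy` and `self_play`, and the use of exact expected-value updates are all assumptions made for this example. In the toy game, row action 1 is strictly dominated, and column action 1 becomes strictly dominated only once row action 1 is removed.

```python
# Hypothetical 2x2 general-sum game (payoffs invented for illustration).
# Row action 1 is strictly dominated by row action 0.
# Column action 1 is strictly dominated only after row action 1 is removed.
ROW_PAYOFF = [[3.0, 1.0],
              [2.0, 0.0]]
COL_PAYOFF = [[2.0, 1.0],
              [0.0, 3.0]]


def regret_matching_strategy(cum_regret):
    """Play each action in proportion to its positive cumulative regret."""
    positive = [max(r, 0.0) for r in cum_regret]
    total = sum(positive)
    if total > 0.0:
        return [p / total for p in positive]
    # No positive regret yet: fall back to the uniform strategy.
    return [1.0 / len(cum_regret)] * len(cum_regret)


def self_play(row_payoff, col_payoff, iterations):
    """Run regret matching for both players; return their average strategies."""
    n_rows, n_cols = len(row_payoff), len(row_payoff[0])
    row_regret, col_regret = [0.0] * n_rows, [0.0] * n_cols
    row_sum, col_sum = [0.0] * n_rows, [0.0] * n_cols

    for _ in range(iterations):
        row_s = regret_matching_strategy(row_regret)
        col_s = regret_matching_strategy(col_regret)

        # Expected payoff of each pure action against the opponent's mix.
        row_values = [sum(col_s[j] * row_payoff[i][j] for j in range(n_cols))
                      for i in range(n_rows)]
        col_values = [sum(row_s[i] * col_payoff[i][j] for i in range(n_rows))
                      for j in range(n_cols)]

        # Expected payoff of the strategy actually played this iteration.
        row_ev = sum(row_s[i] * row_values[i] for i in range(n_rows))
        col_ev = sum(col_s[j] * col_values[j] for j in range(n_cols))

        # Accumulate external regret and the strategies used.
        for i in range(n_rows):
            row_regret[i] += row_values[i] - row_ev
            row_sum[i] += row_s[i]
        for j in range(n_cols):
            col_regret[j] += col_values[j] - col_ev
            col_sum[j] += col_s[j]

    avg_row = [s / iterations for s in row_sum]
    avg_col = [s / iterations for s in col_sum]
    return avg_row, avg_col


if __name__ == "__main__":
    avg_row, avg_col = self_play(ROW_PAYOFF, COL_PAYOFF, 1000)
    print("average row strategy:", avg_row)
    print("average column strategy:", avg_col)
```

With 1000 iterations, the average weight on both the dominated row action and the iteratively dominated column action falls below 0.01 in this toy game, and it keeps shrinking as the iteration count grows. The paper's results concern CFR in extensive-form games, which this sketch does not model.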

A Fast-Convergence Method of Monte Carlo Counterfactual Regret Minimization for Imperfect Information Dynamic Games

Efficient CFR for Imperfect Information Games with Instant Updates

Accelerating Nash Equilibrium Convergence in Monte Carlo Settings Through Counterfactual Value Based Fictitious Play

Monte Carlo Continual Resolving for Online Strategy Computation in Imperfect Information Games

RM-FSP: Regret Minimization Optimizes Neural Fictitious Self-Play

Combining Counterfactual Regret Minimization with Information Gain to Solve Extensive Games with Unknown Environments

Regret-Minimizing Double Oracle for Extensive-Form Games

GPU-Accelerated Counterfactual Regret Minimization

Minimizing Weighted Counterfactual Regret with Optimistic Online Mirror Descent

Monte Carlo Neural Fictitious Self-Play: Achieve Approximate Nash equilibrium of Imperfect-Information Games

Scalable sub-game solving for imperfect-information games

Integrating Dynamic Weighted Approach with Fictitious Play and Pure Counterfactual Regret Minimization for Equilibrium Finding

Faster Game Solving via Hyperparameter Schedules

Lazy-CFR: a Fast Regret Minimization Algorithm for Extensive Games with Imperfect Information

CFR-p: Counterfactual Regret Minimization with Hierarchical Policy Abstraction, and its Application to Two-player Mahjong

On the Computational Efficiency of Adaptive and Dynamic Regret Minimization

Regret Minimization in Non-Zero-Sum Games with Applications to Building Champion Multiplayer Computer Poker Agents

Double Neural Counterfactual Regret Minimization

Parallel Counterfactual Regret Minimization in Crowdsourcing Imperfect-information Expanded Game

Optimize Neural Fictitious Self-Play in Regret Minimization Thinking

D2CFR: Minimize Counterfactual Regret With Deep Dueling Neural Network