Abstract: Nash Equilibrium (NE) is the canonical solution concept of game theory and provides an elegant tool for understanding rational behavior. Although a mixed-strategy NE exists in any game with finitely many players and actions, computing an NE in two-player or multi-player general-sum games is PPAD-complete. Various alternative solution concepts, e.g., Correlated Equilibrium (CE), and learning methods, e.g., fictitious play (FP), have been proposed to approximate NE. For convenience, we call these methods "inexact solvers", or "solvers" for short. However, the alternative solution concepts differ from NE, and the learning methods generally fail to converge to NE. Therefore, in this work, we propose the REinforcement Nash Equilibrium Solver (RENES), which trains a single policy to modify games of different sizes and applies the solvers to the modified games; the obtained solution is then evaluated on the original games. Specifically, our contributions are threefold. i) We represent games as $\alpha$-rank response graphs and leverage a graph neural network (GNN) to handle games of different sizes as inputs; ii) we use tensor decomposition, e.g., canonical polyadic (CP) decomposition, to fix the dimension of the modifying actions across games of different sizes; iii) we train the modifying policy with the widely used proximal policy optimization (PPO) and apply the solvers to the modified games, evaluating the obtained solutions on the original games. Extensive experiments on large-scale normal-form games show that our method further improves the NE approximation of different solvers, i.e., $\alpha$-rank, CE, FP, and projected replicator dynamics (PRD), and generalizes to unseen games.
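
The "modify, solve, evaluate on the original game" loop described in the abstract can be illustrated with a minimal sketch for two-player bimatrix games. Everything here is an assumption made for illustration: the abstract does not state which approximation metric is used, so the sketch measures quality with NashConv (the sum of each player's gain from deviating to a best response); the modification is passed in as plain additive payoff offsets `dA` and `dB` rather than being produced by a trained policy; and the names `nash_conv`, `fictitious_play`, and `evaluate_modification` are hypothetical, not taken from the paper.

```python
import numpy as np


def nash_conv(A, B, x, y):
    """Sum of both players' best-response gains at the profile (x, y).

    A and B are the row and column player's payoff matrices. The value is
    zero exactly at a Nash equilibrium and positive otherwise.
    """
    gain_row = np.max(A @ y) - x @ A @ y
    gain_col = np.max(x @ B) - x @ B @ y
    return float(gain_row + gain_col)


def fictitious_play(A, B, iters=2000):
    """Simultaneous fictitious play; returns the empirical average strategies."""
    m, n = A.shape
    counts_x = np.zeros(m)
    counts_y = np.zeros(n)
    counts_x[0] = 1.0  # arbitrary initial action for each player
    counts_y[0] = 1.0
    for _ in range(iters):
        x = counts_x / counts_x.sum()
        y = counts_y / counts_y.sum()
        # Each player best-responds to the opponent's empirical strategy.
        counts_x[np.argmax(A @ y)] += 1.0
        counts_y[np.argmax(x @ B)] += 1.0
    return counts_x / counts_x.sum(), counts_y / counts_y.sum()


def evaluate_modification(A, B, dA, dB, solver=fictitious_play):
    """Solve the modified game (A + dA, B + dB), score the result on (A, B)."""
    x, y = solver(A + dA, B + dB)
    return nash_conv(A, B, x, y)


# Matching pennies as a small example game.
A = np.array([[1.0, -1.0], [-1.0, 1.0]])
B = -A
zero = np.zeros_like(A)
baseline = evaluate_modification(A, B, zero, zero)  # solver on the unmodified game
```

Under these assumptions, a candidate modification would count as helpful when `evaluate_modification(A, B, dA, dB)` is lower than `baseline`; a learned policy would be trained to propose such offsets, which this sketch does not attempt.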

Posterior Sampling for Multi-Agent Reinforcement Learning: Solving Extensive Games with Imperfect Information

Posterior Sampling for Competitive RL: Function Approximation and Partial Observation

Posterior Sampling for Continuing Environments

A Bayesian Learning Algorithm for Unknown Zero-sum Stochastic Games with an Arbitrary Opponent

Monte Carlo Neural Fictitious Self-Play: Achieve Approximate Nash equilibrium of Imperfect-Information Games.

Posterior Sampling for Deep Reinforcement Learning

Learning to Play General-Sum Games against Multiple Boundedly Rational Agents

Reinforcement Learning In Two Player Zero Sum Simultaneous Action Games

Pipeline PSRO: A Scalable Approach for Finding Approximate Nash Equilibria in Large Games

A Self-Play Posterior Sampling Algorithm for Zero-Sum Markov Games

RL-CFR: Improving Action Abstraction for Imperfect Information Extensive-Form Games with Reinforcement Learning

An Efficient Deep Reinforcement Learning Algorithm for Solving Imperfect Information Extensive-Form Games.

Sample-Efficient Multi-Agent RL: an Optimization Perspective.

Scalable sub-game solving for imperfect-information games

A Unified Perspective on Deep Equilibrium Finding

Provably Efficient Information-Directed Sampling Algorithms for Multi-Agent Reinforcement Learning

Reinforcement Nash Equilibrium Solver

Dueling Posterior Sampling for Preference-Based Reinforcement Learning

Combining Counterfactual Regret Minimization with Information Gain to Solve Extensive Games with Unknown Environments

A Risk-Averse Equilibrium for Multi-Agent Systems

Combining Tree-Search, Generative Models, and Nash Bargaining Concepts in Game-Theoretic Reinforcement Learning