Abstract: Game theory is a profound study of distributed decision-making behavior and has been extensively developed by many scholars. However, many existing works rely on strict assumptions, such as knowing the opponent's private behaviors, which might not be practical. In this work, we focus on two Nobel-winning concepts, the Nash equilibrium and the correlated equilibrium. Specifically, we reach a correlated equilibrium outside the convex hull of the Nash equilibria with our proposed deep reinforcement learning algorithm. Given the correlated equilibrium probability distribution, we also propose a mathematical model that inverts the calculation of that distribution to estimate the opponent's payoff vector. With those payoffs, deep reinforcement learning learns why and how the rational opponent plays, instead of just learning the regions for corresponding strategies and actions. Through simulations, we show that our proposed method can achieve the optimal correlated equilibrium, which lies outside the convex hull of the Nash equilibria, with limited interaction among players.
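For reference, the standard textbook definition of a correlated equilibrium can be written as a set of linear inequalities. The notation below is an assumption introduced here for illustration and is not taken from the paper: $p$ is a joint distribution over action profiles, $A_i$ is player $i$'s action set, $u_i$ is player $i$'s payoff, and $a_{-i}$ denotes the actions of the other players.

```latex
\sum_{a_{-i} \in A_{-i}} p(a_i, a_{-i}) \,
  \bigl[ u_i(a_i, a_{-i}) - u_i(a_i', a_{-i}) \bigr] \;\ge\; 0
\qquad \text{for every player } i \text{ and all } a_i, a_i' \in A_i .
```

A Nash equilibrium is the special case in which $p$ factors into independent per-player distributions. Every convex combination of Nash equilibria satisfies these inequalities, but so can distributions outside that convex hull, which is the region the abstract refers to.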

What problem does this paper attempt to address?

The paper addresses how to reach a correlated equilibrium in non-cooperative games. Specifically, the authors aim to reach a correlated equilibrium that lies outside the convex hull of the Nash equilibria, using a policy-based deep reinforcement learning algorithm under limited information. Compared with a Nash equilibrium, such a correlated equilibrium can give participants higher returns, but it also raises challenges: how to design public signals that guide participants toward the correlated equilibrium, and how to do so without full knowledge of the opponents' payoff vectors.

### Core problems of the paper

1. **How to reach the correlated equilibrium in non-cooperative games**:
   - Many existing studies rely on strict assumptions, such as knowing the opponents' private behaviors, which may be unrealistic in practical applications.
   - The authors propose a method based on deep reinforcement learning that estimates the opponents' payoff vectors by learning their behaviors, thereby reaching the correlated equilibrium.
2. **How to design public signals**:
   - A correlated equilibrium is based on all participants choosing actions according to public signals. A key issue is how to design this public signal so that it guides participants to the correlated equilibrium without leaking too much information.
   - The authors propose generating public signals through limited information interaction and learning the meaning of these signals with a deep reinforcement learning model.
3. **How to estimate the opponents' payoff vectors under limited information**:
   - To reach the correlated equilibrium, participants need to know the opponents' payoff vectors. In actual games, however, this information is usually unknown.
   - The authors propose a mathematical model that combines the concept of correlated equilibrium with the idea of tension to estimate the opponents' payoff vectors.

### Solutions

- **Deep reinforcement learning model**:
  - The authors design a policy-based deep reinforcement learning model that learns the joint distribution through interaction with the environment and finally reaches the correlated equilibrium.
  - The inputs of the model include the current state and the previous state, and the output is a probability distribution over actions.
- **Mathematical model**:
  - The authors propose a mathematical model that combines the definition of correlated equilibrium with the concept of tension to estimate the opponents' payoff vectors.
  - The model solves linear equations to maximize the participants' returns subject to the correlated equilibrium conditions.

### Experimental verification

- Through simulation experiments, the authors show that the proposed method can reach the optimal correlated equilibrium within a limited number of interactions, and that this equilibrium lies outside the convex hull of the Nash equilibria, which supports the effectiveness of the method.

### Summary

The main contribution of this paper is a method for reaching a correlated equilibrium in non-cooperative games through deep reinforcement learning and limited information interaction. The method can increase the participants' returns, and it also addresses the problems of information opacity and public signal design found in traditional methods.
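The idea of a correlated equilibrium lying outside the convex hull of the Nash equilibria can be illustrated on a small example. The sketch below is not the paper's algorithm or its game: it assumes the classic two-player game of Chicken with textbook payoffs, and the names `PAYOFF`, `signal`, `NASH_PAYOFFS`, `is_correlated_equilibrium`, and `expected_payoffs` are hypothetical helpers introduced here. It only checks the standard correlated equilibrium inequalities for a given public-signal distribution and compares the resulting payoffs with those of the game's Nash equilibria.

```python
from fractions import Fraction as F
from itertools import product

ACTIONS = ("D", "C")  # D = dare, C = chicken (assumed example game)

# PAYOFF[(row_action, col_action)] = (row_payoff, col_payoff)
PAYOFF = {
    ("D", "D"): (0, 0),
    ("D", "C"): (7, 2),
    ("C", "D"): (2, 7),
    ("C", "C"): (6, 6),
}


def is_correlated_equilibrium(dist):
    """Return True if no player gains by deviating from a recommendation."""
    for player in (0, 1):
        for rec, dev in product(ACTIONS, repeat=2):
            if rec == dev:
                continue
            gain_from_obeying = F(0)
            for other in ACTIONS:
                # Profile when the player obeys vs. deviates, opponent fixed.
                obey = (rec, other) if player == 0 else (other, rec)
                defect = (dev, other) if player == 0 else (other, dev)
                gain_from_obeying += dist[obey] * (
                    PAYOFF[obey][player] - PAYOFF[defect][player]
                )
            if gain_from_obeying < 0:
                return False
    return True


def expected_payoffs(dist):
    """Expected (row, column) payoffs under the joint distribution."""
    return tuple(sum(dist[a] * PAYOFF[a][i] for a in dist) for i in (0, 1))


# Public signal: recommend (D,C), (C,D), (C,C) with probability 1/3 each.
signal = {
    ("D", "D"): F(0),
    ("D", "C"): F(1, 3),
    ("C", "D"): F(1, 3),
    ("C", "C"): F(1, 3),
}

# Payoffs of the three Nash equilibria of this game: two pure equilibria
# and the mixed one in which each player dares with probability 1/3.
NASH_PAYOFFS = [(F(7), F(2)), (F(2), F(7)), (F(14, 3), F(14, 3))]

if __name__ == "__main__":
    print("is CE:", is_correlated_equilibrium(signal))
    print("CE payoffs:", expected_payoffs(signal))
    print("best Nash payoff sum:", max(sum(v) for v in NASH_PAYOFFS))
```

Because the total payoff is a linear function, its maximum over the convex hull of the Nash payoffs is attained at one of the three listed points. A correlated equilibrium whose total payoff exceeds that maximum therefore cannot lie inside the hull.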

Achieving Correlated Equilibrium by Studying Opponent's Behavior Through Policy-Based Deep Reinforcement Learning

Deep Reinforcement Learning for Nash Equilibrium of Differential Games

A Unified Perspective on Deep Equilibrium Finding

Monte Carlo Neural Fictitious Self-Play: Achieve Approximate Nash equilibrium of Imperfect-Information Games.

Efficient Competitive Self-Play Policy Optimization

Score-Based Equilibrium Learning in Multi-Player Finite Games with Imperfect Information

Playing Extensive Games with Learning of Opponent's Cognition

Resolving Implicit Coordination in Multi-Agent Deep Reinforcement Learning with Deep Q-Networks & Game Theory

Emergence of Cooperation in Two-agent Repeated Games with Reinforcement Learning

Reinforcement Nash Equilibrium Solver

Hierarchical Deep Reinforcement Learning Agent with Counter Self-play on Competitive Games

Learning to Play No-Press Diplomacy with Best Response Policy Iteration

Cooperative Equilibrium: A solution predicting cooperative play

Steering control of payoff-maximizing players in adaptive learning dynamics

Opponent Cart-Pole Dynamics for Reinforcement Learning of Competing Agents

Offline Equilibrium Finding

Reinforcement Learning In Two Player Zero Sum Simultaneous Action Games

Nash Equilibria in the Response Strategy of Correlated Games

Opponent Modeling in Deep Reinforcement Learning

Understanding and Diagnosing Deep Reinforcement Learning