Abstract: Repeated games consider a situation where multiple agents are motivated by their independent rewards throughout learning. In general, the dynamics of their learning become complex. Especially when their rewards compete with each other, as in zero-sum games, the dynamics often do not converge to their optimum, i.e., the Nash equilibrium. To tackle such complexity, many studies have treated various learning algorithms as dynamical systems and derived qualitative insights about them. However, such studies have yet to handle multi-memory games (where agents can memorize actions they played in the past and choose their actions based on their memories), even though memorization plays a pivotal role in artificial intelligence and interpersonal relationships. This study extends two major learning algorithms in games, i.e., replicator dynamics and gradient ascent, to multi-memory games. Then, we prove that their dynamics are identical. Furthermore, theoretically and experimentally, we clarify that the learning dynamics diverge from the Nash equilibrium in multi-memory zero-sum games and reach heteroclinic cycles (sojourning longer around the boundary of the strategy space), providing a fundamental advance in learning in games.

What problem does this paper attempt to address?

This paper examines how learning dynamics in multi-memory games deviate from the Nash equilibrium and reach heteroclinic cycles. Specifically:

1. **Learning Algorithms in Multi-Memory Games**: The paper extends two major learning algorithms, replicator dynamics and gradient ascent, to multi-memory games, and proves that the two resulting dynamics are identical.
2. **Uniqueness of Nash Equilibrium**: For one-memory, two-action zero-sum games, the paper proves that the Nash equilibrium is unique and gives it in closed form:

   \[ x^* = \frac{-u_3 + u_4}{u_1 - u_2 - u_3 + u_4}, \quad y^* = \frac{-u_2 + u_4}{u_1 - u_2 - u_3 + u_4} \]

3. **Complexity of Learning Dynamics**: Through theoretical analysis and experiments, the paper shows that multi-memory learning dynamics behave in a complex way around the Nash equilibrium. Rather than converging, they often diverge from it and reach heteroclinic cycles, that is, they stay longer near the boundary of the strategy space.
4. **Chaotic Phenomena**: The paper also finds that learning dynamics in multi-memory games are highly sensitive to initial conditions, as in chaotic systems: even a tiny difference in the initial state leads to significantly different learning trajectories.
5. **Divergence in General Cases**: Numerical simulations further show that, in zero-sum games with other numbers of memories and actions, the learning dynamics likewise diverge from the Nash equilibrium and reach heteroclinic cycles.

In conclusion, this paper aims to understand the learning dynamics of multi-memory games, especially zero-sum games: how these dynamics deviate from the Nash equilibrium, and the mechanisms and manifestations of that deviation. The work provides new theoretical insights and reveals the complexity of learning dynamics in multi-memory games.
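
The closed-form equilibrium in item 2 can be checked numerically. The following is a minimal sketch, not the paper's code. It assumes, since the summary does not say so, that \(u_1, \dots, u_4\) are the entries of X's payoff matrix \([[u_1, u_2], [u_3, u_4]]\) (rows are X's actions, columns are Y's, and Y receives the negative), and that \(x^*\) and \(y^*\) are the probabilities with which X and Y play their first action. The function name `mixed_nash_2x2` is hypothetical.

```python
from fractions import Fraction


def mixed_nash_2x2(u1, u2, u3, u4):
    """Return (x*, y*) from the closed-form formula, using exact arithmetic.

    Assumption: integer payoffs laid out as X's matrix [[u1, u2], [u3, u4]].
    """
    denom = u1 - u2 - u3 + u4
    if denom == 0:
        raise ValueError("formula undefined: u1 - u2 - u3 + u4 == 0")
    x = Fraction(u4 - u3, denom)
    y = Fraction(u4 - u2, denom)
    if not (0 < x < 1 and 0 < y < 1):
        raise ValueError("no interior mixed equilibrium for these payoffs")
    return x, y


def payoff_to_x(u1, u2, u3, u4, x, y):
    """Expected payoff to X when X plays row 1 w.p. x and Y plays column 1 w.p. y."""
    return (x * y * u1 + x * (1 - y) * u2
            + (1 - x) * y * u3 + (1 - x) * (1 - y) * u4)


# Matching pennies: both players mix uniformly.
x_star, y_star = mixed_nash_2x2(1, -1, -1, 1)

# Indifference check on an asymmetric game: at (x*, y*), each player's pure
# actions all yield the same expected payoff, so no unilateral deviation helps.
u = (2, -1, -1, 1)
xs, ys = mixed_nash_2x2(*u)
value = payoff_to_x(*u, xs, ys)
row_payoffs = [payoff_to_x(*u, 1, ys), payoff_to_x(*u, 0, ys)]
col_payoffs = [payoff_to_x(*u, xs, 1), payoff_to_x(*u, xs, 0)]
```

Under the assumed layout, the formula is the usual indifference condition: \(x^*\) makes Y's two columns equally good, and \(y^*\) does the same for X's two rows. `Fraction` keeps the check exact, which is why the sketch expects integer payoffs.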

Learning in Multi-Memory Games Triggers Complex Dynamics Diverging from Nash Equilibrium

Memory Asymmetry Creates Heteroclinic Orbits to Nash Equilibrium in Learning in Zero-Sum Games

Global Behavior of Learning Dynamics in Zero-Sum Games with Memory Asymmetry

Synchronization behind Learning in Periodic Zero-Sum Games Triggers Divergence from Nash equilibrium

Adaptive algorithm for multi-agent learning optimal cooperative pursuit strategy based on Markov game

Nash Equilibrium and Learning Dynamics in Three-Player Matching $m$-Action Games

Corrupted Learning Dynamics in Games

Learning in Games: a Systematic Review

Chaos of Learning Beyond Zero-sum and Coordination via Game Decompositions

Convergence of Heterogeneous Learning Dynamics in Zero-sum Stochastic Games

Nash Equilibrium in Iterated Multiplayer Games Under Asynchronous Best-Response Dynamics

The dynamics of competitive learning: the role of updates and memory

Convergence of Learning Dynamics in Stackelberg Games

Higher-Order Uncoupled Dynamics Do Not Lead to Nash Equilibrium -- Except When They Do

Is Learning in Games Good for the Learners?

On the Convergence of No-Regret Learning Dynamics in Time-Varying Games

Learning in Multi-Player Stochastic Games

No-Regret Learning in Time-Varying Zero-Sum Games

Neural Auto-Curricula in Two-Player Zero-Sum Games

Chaos persists in large-scale multi-agent learning despite adaptive learning rates

Asymptotic Convergence and Performance of Multi-Agent Q-Learning Dynamics