Abstract: The framework of multi-agent learning explores how individual agent strategies evolve in response to the evolving strategies of other agents. Of particular interest is whether agent strategies converge to well-known solution concepts such as Nash Equilibrium (NE). Most "fixed order" learning dynamics restrict an agent's underlying state to be its own strategy. In "higher order" learning, agent dynamics can include auxiliary states that capture phenomena such as path dependencies. We introduce higher-order gradient play dynamics that resemble projected gradient ascent with auxiliary states. The dynamics are "payoff based" in that each agent's dynamics depend on its own evolving payoff. While these payoffs depend on the strategies of other agents in a game setting, agent dynamics do not depend explicitly on the nature of the game or the strategies of other agents. In this sense, the dynamics are "uncoupled": an agent's dynamics do not depend explicitly on the utility functions of other agents. We first show that for any specific game with an isolated completely mixed-strategy NE, there exist higher-order gradient play dynamics that lead (locally) to that NE, both for the specific game and for nearby games with perturbed utility functions. Conversely, we show that for any higher-order gradient play dynamics, there exists a game with a unique isolated completely mixed-strategy NE for which the dynamics do not lead to NE. These results build on prior work showing that uncoupled fixed-order learning cannot lead to NE in certain instances, whereas higher-order variants can. Finally, we consider the mixed-strategy equilibrium associated with coordination games. While higher-order gradient play can converge to such equilibria, we show that such dynamics must be inherently internally unstable.
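To make the contrast between fixed-order and higher-order gradient play concrete, the following is a minimal sketch under stated assumptions; it is not the construction analyzed in the paper. It assumes the matching pennies game (whose unique NE is completely mixed at p = q = 0.5), a forward-Euler discretization, and one hypothetical auxiliary state per agent: a low-pass filter of the agent's own payoff gradient, used to estimate that gradient's rate of change. The names `simulate`, `gamma`, `lam`, and `dt`, and all parameter values, are illustrative choices. Each agent uses only its own payoff signal, so the update is uncoupled in the sense described above.

```python
# Illustrative sketch only: game, discretization, and auxiliary-state
# design are assumptions, not the dynamics studied in the paper.

def clip01(v):
    """Project a probability onto [0, 1] (the 2-action simplex)."""
    return min(1.0, max(0.0, v))

def payoff_gradients(p, q):
    """Matching pennies with U1 = (2p-1)(2q-1) and U2 = -U1.
    Returns (dU1/dp, dU2/dq): each agent's own payoff gradient."""
    return 2.0 * (2.0 * q - 1.0), -2.0 * (2.0 * p - 1.0)

def simulate(gamma, lam=10.0, dt=0.01, steps=5000, p=0.7, q=0.4):
    """Projected gradient play with one auxiliary state per agent.
    gamma = 0 recovers fixed-order gradient play; gamma > 0 adds a
    term proportional to the estimated derivative of the gradient."""
    # Auxiliary states start at the current payoff gradients, so the
    # initial derivative estimates are zero.
    rp, rq = payoff_gradients(p, q)
    for _ in range(steps):
        gp, gq = payoff_gradients(p, q)
        # Filtered-difference estimate of d(gradient)/dt.
        dgp_est = lam * (gp - rp)
        dgq_est = lam * (gq - rq)
        # Agent i uses only its own gradient and its own auxiliary state.
        p_next = clip01(p + dt * (gp + gamma * dgp_est))
        q_next = clip01(q + dt * (gq + gamma * dgq_est))
        # Low-pass filter update for the auxiliary states.
        rp += dt * lam * (gp - rp)
        rq += dt * lam * (gq - rq)
        p, q = p_next, q_next
    return p, q

def distance_to_ne(p, q):
    """Euclidean distance from the mixed NE (0.5, 0.5)."""
    return ((p - 0.5) ** 2 + (q - 0.5) ** 2) ** 0.5

fixed_order = simulate(gamma=0.0)
higher_order = simulate(gamma=0.1)
print("fixed-order distance to NE: ", distance_to_ne(*fixed_order))
print("higher-order distance to NE:", distance_to_ne(*higher_order))
```

With these particular values, the fixed-order run keeps circling away from the mixed NE, while the run with the auxiliary state settles at it. This illustrates the "except when they do" direction for one game only; per the abstract, for any such higher-order dynamics there is some other game in which they fail to reach NE.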

Stability of the Nash Equilibrium under Gradient Ascent Learning Algorithms in Two-Agent Two-Action Games

Convergence of Policy Gradient Methods for Nash Equilibria in General-sum Stochastic Games

On Gradient-Based Learning in Continuous Games

Convergence of Learning Dynamics in Stackelberg Games

Deep Reinforcement Learning for Nash Equilibrium of Differential Games

Decentralized Policy Gradient for Nash Equilibria Learning of General-sum Stochastic Games

Passivity-based Gradient-Play Dynamics for Distributed Generalized Nash Equilibrium Seeking

Learning generalized Nash equilibria in multi-agent dynamical systems via extremum seeking control

On the Stability of Learning in Network Games with Many Players

Gradient play in stochastic games: stationary points, convergence, and sample complexity

Gradient Dynamics in Linear Quadratic Network Games with Time-Varying Connectivity and Population Fluctuation

A unified stochastic approximation framework for learning in games

A Payoff-Based Policy Gradient Method in Stochastic Games with Long-Run Average Payoffs

Higher-Order Uncoupled Dynamics Do Not Lead to Nash Equilibrium -- Except When They Do

ε-Nash equilibrium of non-cooperative Lagrangian dynamic games based on the average sub-gradient robust integral sliding mode control

Towards convergence to Nash equilibria in two-team zero-sum games

Learning generalized Nash equilibria in monotone games: A hybrid adaptive extremum seeking control approach

A Policy-Gradient Approach to Solving Imperfect-Information Games with Iterate Convergence

On Passivity, Reinforcement Learning and Higher-Order Learning in Multi-Agent Finite Games

Distributed Nash equilibrium seeking strategies via bilateral bounded gradient approach

Learning to Control Unknown Strongly Monotone Games