Last-Iterate Convergence Properties of Regret-Matching Algorithms in Games

Yang Cai, Gabriele Farina, Julien Grand-Clément, Christian Kroer, Chung-Wei Lee, Haipeng Luo, Weiqiang Zheng
2023-11-02
Abstract: Algorithms based on regret matching, specifically regret matching$^+$ (RM$^+$) and its variants, are the most popular approaches for solving large-scale two-player zero-sum games in practice. Unlike algorithms such as optimistic gradient descent ascent, which have strong last-iterate and ergodic convergence properties for zero-sum games, virtually nothing is known about the last-iterate properties of regret-matching algorithms. Given the importance of last-iterate convergence for numerical optimization and its relevance for modeling real-world learning in games, in this paper we study the last-iterate convergence properties of various popular variants of RM$^+$. First, we show numerically that several practical variants, such as simultaneous RM$^+$, alternating RM$^+$, and simultaneous predictive RM$^+$, all lack last-iterate convergence guarantees even on a simple $3\times 3$ game. We then prove that recent variants of these algorithms based on a smoothing technique do enjoy last-iterate convergence: we prove that extragradient RM$^{+}$ and smooth Predictive RM$^+$ enjoy asymptotic last-iterate convergence (without a rate) and $1/\sqrt{t}$ best-iterate convergence. Finally, we introduce restarted variants of these algorithms, and show that they enjoy linear-rate last-iterate convergence.
Computer Science and Game Theory, Machine Learning
What problem does this paper attempt to address?
This paper addresses the last-iterate convergence of the regret matching$^+$ (RM+) algorithm and its variants in two-player zero-sum games. Specifically, the research objectives include:

1. **Explore the last-iterate behavior of existing RM+ variants**: The paper shows through numerical experiments that several popular RM+ variants (simultaneous RM+, alternating RM+, and simultaneous Predictive RM+) lack last-iterate convergence even on a simple 3×3 game.
2. **Prove the last-iterate convergence of specific RM+ variants**: The paper proves that recently proposed RM+ variants based on a smoothing technique, namely Extragradient RM+ (ExRM+) and Smooth Predictive RM+ (SPRM+), converge in the last iterate. Specifically:
   - ExRM+ and SPRM+ have asymptotic last-iterate convergence (without a rate).
   - Their best-iterate convergence rate is \( O\left(\frac{1}{\sqrt{t}}\right) \).
3. **Introduce a restart mechanism to achieve a linear convergence rate**: The paper introduces restarted variants (Restart ExRM+ and Restart SPRM+) and proves that they achieve a linear last-iterate convergence rate.

### Main problem summary

The core problem of this paper is to characterize the last-iterate convergence properties of RM+ and its variants when solving zero-sum games. Unlike algorithms such as optimistic gradient descent ascent, whose last-iterate and ergodic convergence in zero-sum games is well established, RM+ and its variants are the most popular methods in practice, yet virtually nothing was known about their last-iterate behavior. This paper aims to fill that theoretical gap and to propose algorithm variants with provable last-iterate guarantees.

### Key contributions

1. **Numerical evidence**: Provide numerical evidence that RM+ and its important variants (such as alternating RM+ and simultaneous Predictive RM+) may fail to converge asymptotically in the last iterate.
2. **Theoretical proof**: Prove the asymptotic last-iterate convergence of ExRM+ and SPRM+, together with a \( O\left(\frac{1}{\sqrt{t}}\right) \) best-iterate convergence rate.
3. **Restart mechanism**: Propose restarted versions of ExRM+ and SPRM+ and prove their linear last-iterate convergence rate.
4. **Positive results under strict assumptions**: Prove that, under the restrictive condition that the game has a strict Nash equilibrium, RM+ does converge in the last iterate.

Together, these results provide a theoretical basis and practical guidance for understanding and improving regret-matching-based algorithms.