Boosting Perturbed Gradient Ascent for Last-Iterate Convergence in Games

Kenshi Abe, Mitsuki Sakamoto, Kaito Ariu, Atsushi Iwasaki
2024-10-03
Abstract: This paper presents a payoff perturbation technique, introducing a strong convexity to players' payoff functions in games. This technique is specifically designed for first-order methods to achieve last-iterate convergence in games where the gradient of the payoff functions is monotone in the strategy profile space, potentially containing additive noise. Although perturbation is known to facilitate the convergence of learning algorithms, the magnitude of perturbation requires careful adjustment to ensure last-iterate convergence. Previous studies have proposed a scheme in which the magnitude is determined by the distance from a periodically re-initialized anchoring or reference strategy. Building upon this, we propose Gradient Ascent with Boosting Payoff Perturbation, which incorporates a novel perturbation into the underlying payoff function, maintaining the periodically re-initializing anchoring strategy scheme. This innovation empowers us to provide faster last-iterate convergence rates against the existing payoff perturbed algorithms, even in the presence of additive noise.
Computer Science and Game Theory
### What problem does this paper attempt to solve?

This paper addresses the problem of achieving **last-iterate convergence** in monotone games. Specifically, the authors propose a new perturbed gradient ascent algorithm, Gradient Ascent with Boosting Payoff Perturbation (GABP), that reaches a Nash equilibrium faster, including under noisy feedback.

#### Main problem background

1. **Learning in monotone games**:
   - In monotone games, the gradient of the payoff functions is monotone in the strategy profile space. This class covers many common games, such as convex-concave games and zero-sum games.
   - Typical learning algorithms (such as gradient ascent and multiplicative weights update) converge to a Nash equilibrium only in the average-iterate sense, which can be unsatisfactory in applications such as training generative adversarial networks or fine-tuning large language models.
2. **The need for last-iterate convergence**:
   - Last-iterate convergence means that the updated strategy profile itself converges to a Nash equilibrium, rather than the average of the iterates. This is a stronger guarantee than average-iterate convergence and is better suited to practical applications.
3. **The impact of noisy feedback**:
   - In practice, feedback is often noisy, which places higher demands on learning algorithms. Some existing optimistic algorithms perform well without noise but converge slowly under noisy feedback.

#### Core contributions of the paper

1. **Introduction of enhanced perturbation techniques**:
   - A new perturbation technique is proposed that stabilizes the learning process by introducing strong convexity and maintains fast convergence in the presence of noise.
2. **Improved algorithm design**:
   - The GABP algorithm builds on Adaptively Perturbed Mirror Descent (APMD) and adds an extra perturbation term. This term accelerates convergence and improves performance under noisy feedback.
3. **Theoretical analysis and experimental verification**:
   - The paper proves last-iterate convergence rates for GABP of \(\tilde{O}(1/T)\) under full feedback and \(\tilde{O}(1/T^{1/7})\) under noisy feedback, both faster than those of existing payoff-perturbed methods.
   - Experiments show that GABP performs well in both random-payoff games and hard convex-concave games. Under noisy feedback in particular, it outperforms the other algorithms tested.

#### Summary

This paper tackles last-iterate convergence in monotone games by introducing a new perturbation technique, and it improves the convergence rate under noisy feedback. The result provides theoretical and practical support for developing more efficient and more robust game-learning algorithms.
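
#### Illustrative sketch: anchored payoff perturbation

The anchoring scheme mentioned in the abstract (a perturbation whose magnitude depends on the distance from a periodically re-initialized anchoring strategy) can be illustrated with a small sketch. This is not GABP itself: it shows only the earlier anchored-perturbation idea that GABP builds on, and it omits the additional perturbation term the paper introduces. The bilinear game `x * y` on `[-1, 1]^2`, the function name `run`, and the parameters `eta`, `mu`, and `anchor_period` are all assumptions chosen for illustration.

```python
import math


def clip(value, low=-1.0, high=1.0):
    """Project a scalar strategy onto the interval [low, high]."""
    return max(low, min(high, value))


def run(steps, eta=0.1, mu=0.0, anchor_period=None, start=(1.0, 0.5)):
    """Projected gradient ascent on the zero-sum game with payoff x * y.

    Player 1 maximizes x * y and player 2 maximizes -x * y, so the unique
    Nash equilibrium is (0, 0). With mu > 0, each payoff is perturbed by
    -(mu / 2) * (distance to the anchor)^2, which makes it strongly concave.
    The anchor is reset to the current iterate every `anchor_period` steps.
    Returns the distance of the last iterate from the equilibrium.
    (Illustrative setup only; not the paper's algorithm or experiments.)
    """
    x, y = start
    anchor_x, anchor_y = start
    for t in range(1, steps + 1):
        # Payoff gradients plus the gradient of the perturbation term.
        grad_x = y - mu * (x - anchor_x)
        grad_y = -x - mu * (y - anchor_y)
        # Simultaneous update followed by projection onto [-1, 1].
        x, y = clip(x + eta * grad_x), clip(y + eta * grad_y)
        # Periodically re-initialize the anchoring strategy.
        if anchor_period and t % anchor_period == 0:
            anchor_x, anchor_y = x, y
    return math.hypot(x, y)


if __name__ == "__main__":
    print("plain gradient ascent:", run(2000))
    print("anchored perturbation:", run(2000, mu=1.0, anchor_period=50))
```

In this toy game the unperturbed last iterate keeps cycling away from the equilibrium, while the anchored variant's last iterate approaches it. The sketch uses noiseless gradients, so it says nothing about the noisy-feedback rates discussed above.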