Bandit Learning in Convex Non-Strictly Monotone Games

Tatiana Tatarenko, Maryam Kamgarpour
2023-08-16
Abstract: We address learning Nash equilibria in convex games under the payoff information setting. We consider the case in which the game pseudo-gradient is monotone but not necessarily strictly monotone. This relaxation of strict monotonicity enables application of learning algorithms to a larger class of games, such as a zero-sum game with a merely convex-concave cost function. We derive an algorithm whose iterates provably converge to the least-norm Nash equilibrium in this setting. From the perspective of a single player using the proposed algorithm, we view the game as an instance of online optimization. Through this lens, we quantify the regret rate of the algorithm and provide an approach to choose the algorithm's parameters to minimize the regret rate.
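The distinction between monotone and strictly monotone pseudo-gradients can be checked on a small example. The sketch below uses the bilinear zero-sum game with cost f(x, y) = x * y, which is an illustrative assumption and is not taken from the paper; the helper names `pseudo_gradient` and `monotonicity_gap` are likewise hypothetical. For this game the pseudo-gradient is M(x, y) = (y, -x), and the inner product <M(z1) - M(z2), z1 - z2> is zero for every pair of points, so the map is monotone but not strictly monotone.

```python
import numpy as np


def pseudo_gradient(z):
    """Pseudo-gradient of the illustrative zero-sum game f(x, y) = x * y.

    Player 1 minimizes f over x and player 2 minimizes -f over y, so
    M(x, y) = (df/dx, -df/dy) = (y, -x).
    """
    x, y = z
    return np.array([y, -x])


def monotonicity_gap(z1, z2):
    """Return <M(z1) - M(z2), z1 - z2>.

    The map is monotone if this is nonnegative for all pairs, and
    strictly monotone if it is positive whenever z1 != z2.
    """
    diff = pseudo_gradient(z1) - pseudo_gradient(z2)
    return float(np.dot(diff, z1 - z2))


# Sample random pairs of joint actions and record the gap for each pair.
rng = np.random.default_rng(0)
gaps = [
    monotonicity_gap(rng.normal(size=2), rng.normal(size=2))
    for _ in range(1000)
]

# The gap vanishes (up to rounding) for distinct points, so strict
# monotonicity fails even though monotonicity holds.
max_abs_gap = max(abs(g) for g in gaps)
print(f"largest |gap| over sampled pairs: {max_abs_gap:.2e}")
```

Every sampled pair yields a gap of zero up to floating-point rounding, which is the situation that rules out convergence arguments relying on strict monotonicity.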
Optimization and Control