Abstract: Consider $N$ players, each with a $d$-dimensional action set. Each player's utility function consists of their reward function plus a linear term for each dimension, with coefficients that are controlled by the manager. We assume that the game is strongly monotone, so if each player runs gradient descent, the dynamics converge to a unique Nash equilibrium (NE). The NE is typically inefficient in terms of global performance. The global performance of the system can be improved by imposing $K$-dimensional linear constraints on the NE. We therefore want the manager to pick the controlled coefficients that impose the desired constraints on the NE. However, this requires knowing the players' reward functions and their action sets. Obtaining this game structure information is infeasible in a large-scale network and violates the users' privacy. To overcome this, we propose a simple algorithm that learns to shift the NE of the game to meet the linear constraints by adjusting the controlled coefficients online. Our algorithm only requires the violation of the linear constraints as feedback and does not need to know the reward functions or the action sets. We prove that our algorithm, which is based on two time-scale stochastic approximation, guarantees convergence with probability 1 to the set of NE that meet the target linear constraints. We then provide a mean square convergence rate of $O(t^{-1/4})$ for our algorithm. This is the first such bound for two time-scale stochastic approximation where the slower time-scale is a fixed point iteration with a non-expansive mapping. We demonstrate how our scheme can be applied to optimizing a global quadratic cost at NE and to load balancing in resource allocation games. We provide simulations of our algorithm for these scenarios.
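The abstract does not state the update rules, so the following is only a minimal sketch of the two time-scale idea on a toy game, not the paper's algorithm. Everything in it is an assumption made for illustration: the quadratic rewards, the single constraint on the sum of actions ($K = 1$, $d = 1$), the step-size exponents, and all names (`run_toy_controlled_game`, `targets`, `coupling`, `desired_total`). Players take noisy projected gradient steps on their own utility on the fast time-scale, while the manager adjusts one shared linear coefficient on the slow time-scale using only the observed constraint violation.

```python
import random


def run_toy_controlled_game(num_steps=20000, seed=0):
    """Toy two time-scale loop: fast player updates, slow manager update."""
    rng = random.Random(seed)

    # Hypothetical private game data (never read by the manager update below).
    # Assumed reward: r_n(x) = -0.5*(x_n - targets[n])**2 - coupling*x_n*sum_{m != n} x_m
    targets = [2.0, 3.0, 4.0]
    coupling = 0.2          # small coupling keeps this toy game strongly monotone
    lo, hi = 0.0, 5.0       # each player's action set is the interval [lo, hi]
    noise_std = 0.1         # standard deviation of the gradient noise

    desired_total = 4.2     # assumed linear constraint: sum of actions = 4.2

    x = [0.0] * len(targets)  # players' actions
    alpha = 0.0               # controlled coefficient of the linear term alpha * x_n

    for t in range(1, num_steps + 1):
        eta = 0.5 / t ** 0.6  # fast step size (players)
        eps = 0.5 / t ** 0.9  # slow step size (manager); eps / eta -> 0

        # Players: simultaneous noisy projected gradient step on r_n + alpha * x_n.
        total = sum(x)
        new_x = []
        for n, x_n in enumerate(x):
            grad = targets[n] - x_n - coupling * (total - x_n) + alpha
            grad += rng.gauss(0.0, noise_std)
            new_x.append(min(hi, max(lo, x_n + eta * grad)))
        x = new_x

        # Manager: the only feedback used is the constraint violation.
        violation = sum(x) - desired_total
        alpha -= eps * violation

    return x, alpha
```

Calling `x, alpha = run_toy_controlled_game()` returns the final actions and the learned coefficient. For this particular toy game, solving the interior equilibrium conditions by hand gives `alpha = -1.04` as the coefficient at which the actions sum to 4.2, so the learned value can be compared against that. The manager step decays faster than the players' step so that the players are close to the equilibrium of the current coefficients whenever the manager moves.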

Game-theoretical control with continuous action sets

Cooperative Control and Potential Games

Learning in games with continuous action sets and unknown payoff functions

On Gradient-Based Learning in Continuous Games

Convergence of Decentralized Actor-Critic Algorithm in General-sum Markov Games

Penalty-Regulated Dynamics and Robust Learning Procedures in Games

Continuous control with deep reinforcement learning

Cooperative Path Following Control in Autonomous Vehicles Graphical Games: A Data-Based Off-Policy Learning Approach

Learning to Control Unknown Strongly Monotone Games

Game Theory and Control

Independent and Decentralized Learning in Markov Potential Games

A Scalable Game Theoretic Approach for Coordination of Multiple Dynamic Systems

An efficient model‐free adaptive optimal control of continuous‐time nonlinear non‐zero‐sum games based on integral reinforcement learning with exploration

On convergence rates of game theoretic reinforcement learning algorithms

Multi-Action Restless Bandits with Weakly Coupled Constraints: Simultaneous Learning and Control

Deep Reinforcement Learning for Infinite Horizon Mean Field Problems in Continuous Spaces

Energy-based Potential Games for Joint Motion Forecasting and Control

Generalized Bayesian Nash Equilibrium with Continuous Type and Action Spaces

Kinetin and naphthaleneacetic acid controlled starch formation in isolated roots of Pisum sativum

On the Rate of Convergence of Continuous-Time Game Dynamics in N-Player Potential Games

Recent Developments in Machine Learning Methods for Stochastic Control and Games