Model-Free Reinforcement Learning for Mean Field Games

Rajesh K. Mishra, S. Vishwanath, Deepanshu Vasal
DOI: https://doi.org/10.1109/TCNS.2023.3264934
IF: 4.347
2023-12-01
IEEE Transactions on Control of Network Systems
Abstract: In this article, we propose a model-free reinforcement learning (RL) algorithm, based on sequential decomposition, to obtain optimal policies for mean field games (MFGs). We consider finite-horizon MFGs with a large population of homogeneous players who sequentially make strategic decisions. Each player observes a private state and a mean-field population state representing the empirical distribution of the other players' states. The mean-field state is common information among all the players in the game. Vasal (2020) provided a sequential decomposition algorithm that computes the mean field equilibrium for such games in linear time, rather than the exponential time required in prior literature. We extended the idea of sequential decomposition to propose a model-free RL algorithm for these games using expected Sarsa (Mishra et al., 2020). In this article, we provide detailed convergence proofs for our algorithm. In addition, we propose an inverse reinforcement learning algorithm for MFGs with unknown reward functions. The proposed algorithm learns the reward function by studying an expert's behavior and then computes the optimal policy. We illustrate our results using a cyber-physical security example.
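The abstract names two ingredients that a short sketch may make concrete: the mean-field state as an empirical distribution of player states, and the expected Sarsa update. The code below is a minimal illustration of the generic, textbook tabular expected Sarsa step, not the authors' algorithm; it does not reproduce the sequential decomposition or the equilibrium computation. The function names, the `(private state, mean-field index)` state encoding, and the default parameters are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding: (private state, index of a discretized mean-field state).
State = Tuple[int, int]


def empirical_distribution(player_states: List[int], n_states: int) -> List[float]:
    """Fraction of players in each private state (the mean-field state)."""
    counts = [0] * n_states
    for s in player_states:
        counts[s] += 1
    return [c / len(player_states) for c in counts]


def expected_sarsa_update(
    q: Dict[State, List[float]],
    state: State,
    action: int,
    reward: float,
    next_state: State,
    next_policy: List[float],
    alpha: float = 0.1,
    gamma: float = 1.0,
    terminal: bool = False,
) -> float:
    """Apply one tabular expected Sarsa step in place and return the new Q value.

    The target uses the expectation of Q at the next state under
    `next_policy` (action probabilities), instead of a sampled next action.
    """
    if terminal:
        # At the end of a finite horizon there is no continuation value.
        expected_next = 0.0
    else:
        expected_next = sum(p * q[next_state][a] for a, p in enumerate(next_policy))
    td_target = reward + gamma * expected_next
    q[state][action] += alpha * (td_target - q[state][action])
    return q[state][action]
```

As a usage illustration, one could call `empirical_distribution([0, 1, 1, 2], 3)` to obtain a mean-field state over three private states, and call `expected_sarsa_update` once per observed transition; how the mean-field state is discretized into a table index is left open here.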
Computer Science, Mathematics