Abstract: Imitation learning (IL) aims to achieve optimal actions by learning from demonstrated behaviors without knowing the reward function and transition kernels. Conducting IL with a large population of agents is challenging because agents' interactions grow exponentially with the population size. Mean field theory provides an efficient tool for studying multi-agent problems by aggregating information at the population level. While the approximation is tractable, it is non-trivial to recover mean field Nash equilibria (MFNE) from demonstrations. Importantly, many real-world problems cannot be explained by the classic MFNE concept; these include the traffic network equilibrium induced by public routing recommendations and the pricing equilibrium of goods on e-commerce platforms. In both examples, correlation devices are introduced into the equilibrium through the platform's intervention. To accommodate this, we propose a novel solution concept named adaptive mean field correlated equilibrium (AMFCE) that generalizes MFNE. On the theory side, we first prove the existence of AMFCE and then establish a novel framework based on IL and AMFCE with entropy regularization (MaxEnt-AMFCE) to recover the AMFCE policy from real-world demonstrations. Signatures from rough path theory are then applied to characterize the mean-field evolution. A significant benefit of MaxEnt-AMFCE is that it can recover both the equilibrium policy and the correlation device from data. We test MaxEnt-AMFCE against state-of-the-art IL algorithms for mean field games (MFGs) on several tasks, including a real-world traffic flow prediction problem. The results demonstrate the effectiveness of the proposed method and show its potential for predicting and explaining large-population behavior under correlated signals.

Imitation Learning From Inconcurrent Multi-Agent Interactions

Mean Field Correlated Imitation Learning

Multi-Modal Imitation Learning in Partially Observable Environments

Imitation Learning for Mean Field Games with Correlated Equilibria

Independent Generative Adversarial Self-Imitation Learning in Cooperative Multiagent Systems

Bayesian Multi-type Mean Field Multi-agent Imitation Learning

Learning Meta Representations for Agents in Multi-Agent Reinforcement Learning

CEIL: Generalized Contextual Imitation Learning

Learning Strategy Representation for Imitation Learning in Multi-Agent Games

Mimicking To Dominate: Imitation Learning Strategies for Success in Multiagent Competitive Games

Imitation Learning from Suboptimal Demonstrations via Meta-Learning An Action Ranker

Efficient Off-policy Adversarial Imitation Learning with Imperfect Demonstrations

Multi-Agent Imitation Learning: Value is Easy, Regret is Hard

Mutual-Information Regularized Multi-Agent Policy Iteration

Imitator Learning: Achieve Out-of-the-Box Imitation Ability in Variable Environments

M2I2: Learning Efficient Multi-Agent Communication via Masked State Modeling and Intention Inference

State-only Imitation with Transition Dynamics Mismatch

Co-Imitation Learning without Expert Demonstration

Imitation Learning via Simultaneous Optimization of Policies and Auxiliary Trajectories

A Game-Theoretic Learning Model in Multi-Agent Systems

Good Better Best: Self-Motivated Imitation Learning for Noisy Demonstrations