Fast Adaptation to External Agents via Meta Imitation Counterfactual Regret Advantage

Mingyue Zhang, Zhi Jin, Yang Xu, Zehan Shen, Kun Liu, Keyu Pan
DOI: https://doi.org/10.5555/3463952.3464209
2021-01-01
Autonomous Agents and Multi-Agent Systems
Abstract: This paper focuses on the multi-agent credit assignment problem. We propose a novel multi-agent reinforcement learning algorithm called meta imitation counterfactual regret advantage (MICRA) and a three-phase framework for training, adaptation, and execution of MICRA. The key features are: (1) a counterfactual regret advantage is proposed to optimize the target agents’ policy; (2) a meta-imitator is designed to infer the external agents’ policies. Results show that MICRA outperforms state-of-the-art algorithms.