Learning Effective Communication for Cooperative Pursuit with Multi-Agent Reinforcement Learning

Yubang Deng, Xianghui Cao, Qinmin Yang
DOI: https://doi.org/10.1109/cac57257.2022.10055979
2022-01-01
Abstract: In a multi-agent environment, agents need to communicate with each other frequently. However, resource constraints such as limited bandwidth may prevent them from broadcasting messages continuously. Agents must therefore assess the importance of their local observations and broadcast only when necessary. To address this, we propose a multi-agent reinforcement learning algorithm with communication called Proximal Policy Optimization with Gated Attention Communication (PPO-GAC). It follows the Centralized Training and Decentralized Execution (CTDE) framework and can decide both when to broadcast and how to handle messages received from other agents. We evaluate the algorithm in a multi-agent pursuit task. Simulation results show that pursuers using PPO-GAC outperform those using the baseline algorithms at capturing all evaders.
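The abstract names two mechanisms: a gate that decides when an agent broadcasts, and attention over the messages it receives. The sketch below illustrates that general idea only. It is not the PPO-GAC architecture, which the abstract does not specify. The function names `gate` and `attend`, the sigmoid gate with a fixed threshold, the use of raw observations as messages, and the random weights are all assumptions made for illustration; in a trained system these would be learned networks.

```python
import numpy as np


def gate(observation, w_gate, threshold=0.5):
    """Hypothetical broadcast gate: a sigmoid score compared to a threshold."""
    score = 1.0 / (1.0 + np.exp(-(observation @ w_gate)))
    return bool(score > threshold)


def attend(query, messages):
    """Aggregate received messages with scaled dot-product attention.

    The receiver's own observation acts as the query, and each message
    serves as both key and value (an assumption made for brevity).
    """
    if len(messages) == 0:
        # Nothing was broadcast, so the agent acts on local information only.
        return np.zeros_like(query)
    keys = np.stack(messages)                       # shape (n_messages, dim)
    logits = keys @ query / np.sqrt(query.shape[0])
    weights = np.exp(logits - logits.max())         # numerically stable softmax
    weights /= weights.sum()
    return weights @ keys                           # shape (dim,)


rng = np.random.default_rng(0)
n_agents, dim = 3, 4
observations = rng.normal(size=(n_agents, dim))    # stand-in local observations
w_gate = rng.normal(size=dim)                       # stand-in gate weights

# Each agent decides independently whether to broadcast this step.
sent = [gate(obs, w_gate) for obs in observations]

# Each agent then attends over the messages the other agents actually sent.
contexts = []
for i, obs in enumerate(observations):
    inbox = [observations[j] for j in range(n_agents) if j != i and sent[j]]
    contexts.append(attend(obs, inbox))
```

Each entry of `contexts` would be concatenated with the agent's own observation before being passed to its policy. Only agents whose gate fires consume bandwidth, which is the trade-off the abstract describes.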