Multi-Agent Coordination via Multi-Level Communication

Ziluo Ding, Zeyuan Liu, Zhirui Fang, Kefan Su, Liwen Zhu, Zongqing Lu
2024-11-05
Abstract: The partial observability and stochasticity in multi-agent settings can be mitigated by accessing more information about others via communication. However, the coordination problem still exists since agents cannot communicate actual actions with each other at the same time due to the circular dependencies. In this paper, we propose a novel multi-level communication scheme, Sequential Communication (SeqComm). SeqComm treats agents asynchronously (the upper-level agents make decisions before the lower-level ones) and has two communication phases. In the negotiation phase, agents determine the priority of decision-making by communicating hidden states of observations and comparing the value of intention, obtained by modeling the environment dynamics. In the launching phase, the upper-level agents take the lead in making decisions and then communicate their actions with the lower-level agents. Theoretically, we prove the policies learned by SeqComm are guaranteed to improve monotonically and converge. Empirically, we show that SeqComm outperforms existing methods in various cooperative multi-agent tasks.
Multiagent Systems, Machine Learning
What problem does this paper attempt to address?
This paper addresses the coordination problem in multi-agent reinforcement learning (MARL). Under partial observability and stochasticity, agents can miscoordinate because each one acts on assumptions about what the others will do, and those assumptions may be wrong. Communication alleviates this, but existing methods typically exchange information within a synchronous decision-making framework: since all agents choose at the same time, none can condition on the actual actions of the others, which creates a circular dependency.

To address this, the paper proposes a multi-level communication scheme, Sequential Communication (SeqComm). SeqComm treats agents asynchronously, so upper-level agents make decisions before lower-level ones, and it has two communication phases. In the negotiation phase, agents communicate hidden states of their observations and compare the value of their intentions, obtained by modeling the environment dynamics, to determine the priority of decision-making. In the launching phase, upper-level agents decide first and communicate their actual actions to lower-level agents, which achieves explicit coordination.

The main contributions of the paper include:

1. **Asynchronous Decision Mechanism**: Treating agents' decisions asynchronously resolves the circular dependency that arises in synchronous decision-making.
2. **Multi-Level Communication**: Two communication phases, negotiation and launching, let agents first agree on a decision order and then coordinate on actual actions.
3. **Theoretical Guarantee**: The policies learned by SeqComm are proven to improve monotonically and converge.
4. **Empirical Results**: SeqComm outperforms existing methods, both with and without communication, in various cooperative multi-agent tasks.

In summary, the paper tackles the coordination problem in multi-agent environments by having agents communicate and decide sequentially rather than simultaneously, so that lower-level agents can condition on the actual actions of upper-level ones.
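The two phases can be illustrated with a toy example. The following is a minimal sketch, not the paper's implementation: the intention values are given as plain numbers rather than computed from a learned model of the environment dynamics, and the names `negotiate_order`, `launch`, and `greedy_distinct_target` are hypothetical helpers introduced only for illustration. It shows how fixing a decision order and passing actual actions downstream removes the circular dependency in a task where agents must pick distinct targets.

```python
from typing import Callable, Dict, List

# Hypothetical type: a policy maps (agent, actions already chosen upstream) to an action.
Policy = Callable[[str, Dict[str, str]], str]


def negotiate_order(intention_values: Dict[str, float]) -> List[str]:
    """Negotiation phase (sketch): rank agents by a scalar intention value.

    Assumption: a higher value means higher priority. Ties are broken by
    agent name so the order is deterministic.
    """
    return sorted(intention_values, key=lambda a: (-intention_values[a], a))


def launch(order: List[str], policy: Policy) -> Dict[str, str]:
    """Launching phase (sketch): agents act in priority order.

    Each agent receives the actual actions of every agent above it.
    """
    chosen: Dict[str, str] = {}
    for agent in order:
        chosen[agent] = policy(agent, dict(chosen))
    return chosen


def greedy_distinct_target(preferences: Dict[str, List[str]]) -> Policy:
    """Toy policy: take the most preferred target not already taken upstream."""
    def policy(agent: str, upper_actions: Dict[str, str]) -> str:
        taken = set(upper_actions.values())
        for target in preferences[agent]:
            if target not in taken:
                return target
        return preferences[agent][0]  # every target is taken; fall back
    return policy


# Made-up inputs: all three agents prefer target "x" most.
values = {"a1": 0.2, "a2": 0.9, "a3": 0.5}
prefs = {"a1": ["x", "y", "z"], "a2": ["x", "z", "y"], "a3": ["x", "y", "z"]}

order = negotiate_order(values)
actions = launch(order, greedy_distinct_target(prefs))
print(order)
print(actions)
```

If all three agents chose simultaneously, each would pick its top preference `"x"` and collide. With an agreed order, each lower-level agent sees what was actually chosen above it and avoids the conflict.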