Abstract: The partial observability and stochasticity in multi-agent settings can be mitigated by accessing more information about others via communication. However, the coordination problem still exists since agents cannot communicate actual actions with each other at the same time due to the circular dependencies. In this paper, we propose a novel multi-level communication scheme, Sequential Communication (SeqComm). SeqComm treats agents asynchronously (the upper-level agents make decisions before the lower-level ones) and has two communication phases. In the negotiation phase, agents determine the priority of decision-making by communicating hidden states of observations and comparing the value of intention, obtained by modeling the environment dynamics. In the launching phase, the upper-level agents take the lead in making decisions and then communicate their actions with the lower-level agents. Theoretically, we prove the policies learned by SeqComm are guaranteed to improve monotonically and converge. Empirically, we show that SeqComm outperforms existing methods in various cooperative multi-agent tasks.

What problem does this paper attempt to address?

This paper addresses the coordination problem in Multi-Agent Reinforcement Learning (MARL). Because of partial observability and stochasticity, agents in a multi-agent environment may miscoordinate: each one acts on assumptions about what the others will do and can therefore choose a suboptimal action. Communication alleviates this, but existing methods typically exchange information within a synchronous decision-making framework. Agents that decide at the same time cannot share their actual actions with one another, because each action would depend on the others, which is a circular dependency.

To resolve this, the paper proposes a multi-level communication scheme, Sequential Communication (SeqComm). SeqComm treats agents asynchronously, so upper-level agents decide before lower-level ones, and it has two communication phases: the negotiation phase and the launching phase. In the negotiation phase, agents determine the priority of decision-making by communicating hidden states of observations and comparing the value of intention, which is obtained by modeling the environment dynamics. In the launching phase, upper-level agents decide first and communicate their actual actions to lower-level agents, which makes the coordination explicit.

The main contributions of the paper are:

1. **Asynchronous Decision Mechanism**: Treating agents asynchronously removes the circular dependency that arises in synchronous decision-making.
2. **Multi-Level Communication**: Two communication phases, negotiation and launching, let agents first agree on a decision order and then share actual actions.
3. **Theoretical Guarantee**: The policies learned by SeqComm are proven to improve monotonically and converge.
4. **Empirical Results**: SeqComm outperforms existing methods in various cooperative multi-agent tasks.

In summary, by letting agents decide in a negotiated order and pass their actual actions down that order, the paper targets the coordination problem in cooperative multi-agent settings.
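The two phases can be illustrated with a toy sketch. This is not the paper's implementation: the intention values are hard-coded numbers here (in SeqComm they come from a learned model of the environment dynamics), ranking the highest value first is an assumption, and the names `negotiate`, `launch`, and `pick_free_target` are hypothetical helpers introduced only for this example.

```python
from typing import Callable, Dict, List, Sequence


def negotiate(intention_values: Dict[str, float]) -> List[str]:
    """Negotiation phase (toy): rank agents by intention value.

    Assumption: a higher value means higher priority. Ties are broken
    by agent name so the ordering is deterministic.
    """
    return sorted(intention_values, key=lambda a: (-intention_values[a], a))


def launch(
    order: Sequence[str],
    policy: Callable[[str, Dict[str, str]], str],
) -> Dict[str, str]:
    """Launching phase (toy): agents act in priority order.

    Each agent receives the actual actions already chosen by the
    upper-level agents, so no agent has to guess what they will do.
    """
    actions: Dict[str, str] = {}
    for agent in order:
        # Pass a copy so the policy cannot mutate the shared record.
        actions[agent] = policy(agent, dict(actions))
    return actions


def pick_free_target(
    agent: str,
    upper_actions: Dict[str, str],
    targets: Sequence[str] = ("A", "B", "C"),
) -> str:
    """Hypothetical policy: take the first target no upper-level agent chose."""
    taken = set(upper_actions.values())
    for target in targets:
        if target not in taken:
            return target
    return targets[0]  # every target is taken; fall back to the first one


# Hard-coded stand-ins for the learned intention values.
values = {"agent_0": 0.2, "agent_1": 0.9, "agent_2": 0.5}
order = negotiate(values)
joint_action = launch(order, pick_free_target)
print(order)
print(joint_action)
```

Because each lower-level agent sees the actions chosen above it, the three agents end up on three different targets. If all agents ran `pick_free_target` at the same time with no information about the others, each would pick the first target, which is the kind of miscoordination the summary describes.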

Multi-Agent Coordination via Multi-Level Communication

Multi-Agent Sequential Decision-Making via Communication

Coordination Control for a Class of Multi-Agent Systems under Asynchronous Switching

Multi-Agent Reinforcement Learning Control for Consensus Problems of Uncertain Nonlinear Multi-Agent Systems

Multi-agent Coordination Under Temporal Logic Tasks and Team-Wise Intermittent Communication

Team-wise effective communication in multi-agent reinforcement learning

Enhancing Multi-Agent Coordination through Common Operating Picture Integration

Communication Learning in Multi-Agent Systems from Graph Modeling Perspective

HiSA: Facilitating Efficient Multi-Agent Coordination and Cooperation by Hierarchical Policy with Shared Attention

Research on Multi-Agent Communication and Collaborative Decision-Making Based on Deep Reinforcement Learning

A Survey of Recent Progress in the Study of Distributed High-Order Linear Multi-Agent Coordination

A Role-Based POMDPs Approach for Decentralized Implicit Cooperation of Multiple Agents.

Communication Decision in Decentralized Control of Coordinated System

Learning Attentional Communication for Multi-Agent Cooperation

Modeling and simulation of complex network attributes on coordinating large multiagent system.

A Decentralized Communication Framework based on Dual-Level Recurrence for Multi-Agent Reinforcement Learning

Enhancing Collaboration in Heterogeneous Multiagent Systems Through Communication Complementary Graph

Decentralized Multi-Robot Collision Avoidance in Complex Scenarios With Selective Communication

Learning Multi-Agent Communication from Graph Modeling Perspective

Multi-Agent Incentive Communication via Decentralized Teammate Modeling

Coordination Scheme Probing for Generalizable Multi-Agent Reinforcement Learning