Abstract: Mainstream Multi-Agent Reinforcement Learning (MARL) methods introduce teammate modeling or a communication mechanism into the Centralized Training with Decentralized Execution (CTDE) paradigm to improve coordination performance. However, existing teammate modeling methods predict either actions or local observations, but not both, which limits their applicability. In addition, traditional communication mechanisms consider only the quantity of communication links and ignore the quality of the links that are retained, leading to inefficient and redundant communication. To solve these problems, this paper proposes a novel Multi-Agent Cooperative Strategy with Explicit Teammate Modeling and Targeted Informative Communication (MACS), which generates and sends more informative messages with higher communication efficiency, further improving coordination performance. Specifically, a Variational Auto-Encoder (VAE) is leveraged to allow each agent to simultaneously predict the observations and actions of its teammates, thus generating a more comprehensive communication message. Then, we propose a new Mutual Information (MI) objective between the communication message and the teammate Q-value, which yields informative messages while ensuring the exploration and stability of the method. In addition, a targeted, dynamic, informative communication graph is established by a Graph Neural Network (GNN), which removes redundant communication links through hypothetical analysis, further improving overall communication efficiency. Finally, we conduct experiments in the StarCraft II, Collaborative Navigation, and Multi-Target Multi-Sensor Coverage environments. Experimental results show that the proposed approach outperforms state-of-the-art methods in terms of coordination performance and communication efficiency.
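
As a point of reference for the MI term mentioned in the abstract, the sketch below computes the textbook mutual information I(M; Q) = sum over (m, q) of p(m, q) log[p(m, q) / (p(m) p(q))] for two discrete variables. It is only an illustration of the general quantity, not the objective proposed in the paper: the discretization of the message M and the Q-value Q, the function name `mutual_information`, and the two toy joint tables are all assumptions made here.

```python
import math


def mutual_information(joint):
    """Mutual information I(M; Q) in nats from a joint probability table.

    joint[i][j] is P(M = i, Q = j); entries are assumed non-negative and
    to sum to 1. Both variables are discrete here, which is a simplifying
    assumption for illustration only.
    """
    p_m = [sum(row) for row in joint]        # marginal over messages
    p_q = [sum(col) for col in zip(*joint)]  # marginal over Q-value bins
    mi = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0.0:  # zero-probability cells contribute nothing
                mi += p * math.log(p / (p_m[i] * p_q[j]))
    return mi


# Hypothetical toy tables: a message that says nothing about the Q-value
# versus a message that determines it completely.
independent = [[0.25, 0.25], [0.25, 0.25]]
dependent = [[0.5, 0.0], [0.0, 0.5]]

print(mutual_information(independent))
print(mutual_information(dependent))
```

In this toy setting an uninformative message gives zero mutual information, while a message that fully determines the binary Q-value bin gives log 2 nats, which is the sense in which a higher MI corresponds to a more informative message.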

A Leader-Following Paradigm Based Deep Reinforcement Learning Method for Multi-Agent Cooperation Games

Learning Intra-group Cooperation in Multi-agent Systems

Is Centralized Training with Decentralized Execution Framework Centralized Enough for MARL?

Towards Cooperation in Sequential Prisoner's Dilemmas: a Deep Multiagent Reinforcement Learning Approach

Achieving cooperation through deep multiagent reinforcement learning in sequential prisoner's dilemmas

Deep reinforcement learning algorithm based on multi-agent parallelism and its application in game environment

Inducing Cooperation via Team Regret Minimization based Multi-Agent Deep Reinforcement Learning

A Cooperative Multi-Agent Reinforcement Learning Method Based on Coordination Degree

SCC-rFMQ: a Multiagent Reinforcement Learning Method in Cooperative Markov Games with Continuous Actions

A Multi-Agent Q-Learning with Value Function Approximation Based on Single-leader Multi-followers Stackelberg Game

Enhancing cooperation by cognition differences and consistent representation in multi-agent reinforcement learning

Cooperative Action Decision Based on State Perception Similarity for Deep Reinforcement Learning

Hierarchical Reinforcement Learning with Opponent Modeling for Distributed Multi-agent Cooperation

A new multi-agent reinforcement learning approach

Coordination as inference in multi-agent reinforcement learning

Multi-Agent Evolutionary Reinforcement Learning Based on Cooperative Games

Multi-agent Cooperative Strategy with Explicit Teammate Modeling and Targeted Informative Communication

Consciousness-Aware Multi-Agent Reinforcement Learning

A Method for Multi-Agent Coordination Based on Distributed Reinforcement Learning

Relation-Aware Learning for Multi-Task Multi-Agent Cooperative Games

Learning Multi-Agent Cooperation via Considering Actions of Teammates