Abstract: In many real-world cooperative multiagent reinforcement learning (MARL) tasks, teams of agents can rehearse together before deployment, but communication constraints may then force individual agents to execute independently once deployed. The centralized training and decentralized execution (CTDE) paradigm, which targets this setting, has become increasingly popular in recent years. In the value-based branch of MARL, a credit assignment mechanism is typically used to factorize the team reward into individual rewards; individual-global-max (IGM) is a condition on the factorization ensuring that the agents’ individual action choices coincide with the team’s optimal joint action. However, current architectures fail to consider local coordination within sub-teams, which could be exploited for more effective factorization and, in turn, faster learning. We propose a novel value factorization framework, called multiagent Q-learning with sub-team coordination (QSCAN), to flexibly represent sub-team coordination while honoring the IGM condition. QSCAN encompasses the full spectrum of sub-team coordination according to sub-team size, ranging from the monotonic value function class to the entire IGM function class, with familiar methods such as QMIX and QPLEX located at the respective extremes of the spectrum. Experimental results show that QSCAN outperforms state-of-the-art methods in matrix games, predator-prey tasks, and the Switch challenge in MA-Gym. Additionally, QSCAN achieves performance comparable to those methods in a selection of StarCraft II micro-management tasks.
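
The IGM condition mentioned in the abstract can be illustrated on a small matrix game. The following is a minimal sketch, not QSCAN itself: the utilities `q1` and `q2`, the helper names, and the additive (VDN-style) mixing are all assumptions chosen for illustration. It checks whether the joint action formed by each agent's own greedy choice also maximizes the joint value.

```python
from itertools import product

def greedy_joint_action(individual_qs):
    """Each agent independently picks the argmax of its own utility."""
    return tuple(max(range(len(q)), key=q.__getitem__) for q in individual_qs)

def satisfies_igm(joint_q, individual_qs):
    """True if the decentralized greedy choices attain the maximal joint value."""
    best_value = max(joint_q.values())
    return joint_q[greedy_joint_action(individual_qs)] == best_value

# Hypothetical per-agent utilities: two agents, three actions each.
q1 = [1.0, 3.0, 2.0]
q2 = [0.5, 0.0, 4.0]

# Additive (VDN-style) mixing is monotonic in each utility, so IGM holds.
additive_q = {(a1, a2): q1[a1] + q2[a2]
              for a1, a2 in product(range(3), range(3))}

# A joint table whose optimum lies away from the individual argmaxes.
non_igm_q = dict(additive_q)
non_igm_q[(0, 0)] = 10.0

print(greedy_joint_action([q1, q2]))
print(satisfies_igm(additive_q, [q1, q2]))
print(satisfies_igm(non_igm_q, [q1, q2]))
```

In this sketch the additive table passes the check, while the modified table fails because its best joint action is no longer the one the agents would select on their own. A factorization that honors IGM is one for which this kind of mismatch cannot occur.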

A Distributed Q-Learning Algorithm for Multi-Agent Team Coordination

Target-Value-Competition-Based Multi-Agent Deep Reinforcement Learning Algorithm for Distributed Nonconvex Economic Dispatch

Moving Forward in Formation: A Decentralized Hierarchical Learning Approach to Multi-Agent Moving Together

Learning Intra-group Cooperation in Multi-agent Systems

Multi-agent Deep Reinforcement Learning Algorithm for Distributed Economic Dispatch in Smart Grid

Adaptive algorithm for multi-agent learning optimal cooperative pursuit strategy based on Markov game

Multi-Agent Determinantal Q-Learning

Multi-goal Q-learning of Cooperative Teams

Learning Effective Communication for Cooperative Pursuit with Multi-Agent Reinforcement Learning

Multiagent Q-learning with Sub-Team Coordination

A Multi-Agent Multi-Environment Mixed Q-Learning for Partially Decentralized Wireless Network Optimization

Learning Multi-Agent Cooperation via Considering Actions of Teammates

Efficient off‐policy Q‐learning for multi‐agent systems by solving dual games

Learning Distributed Coordinated Policy in Catching Game with Multi-Agent Reinforcement Learning

$QD$-Learning: A Collaborative Distributed Strategy for Multi-Agent Reinforcement Learning Through Consensus + Innovations

Multi-Agent Reinforcement Learning for Distributed Cooperative Targets Search

A Distributed Path Planning Algorithm via Reinforcement Learning

A multiagent reinforcement learning approach based on different states

I2Q: A Fully Decentralized Q-Learning Algorithm

Learning multiagent coordination in the absence of communication channels

Adaptive Individual Q-Learning-A Multiagent Reinforcement Learning Method for Coordination Optimization