Abstract: Multiple unmanned aerial vehicle (multi-UAV) systems have recently demonstrated significant advantages in some real-world scenarios, but the limited communication range of UAVs poses great challenges to multi-UAV collaborative decision-making. By formulating the multi-UAV cooperation problem as a multi-agent system (MAS), cooperative decision-making among UAVs can be realized with multi-agent reinforcement learning (MARL). Following this paradigm, this work focuses on developing partially observable MARL models that capture important information from local observations in order to select effective actions. Previous related studies employ either probability distributions or a weighted mean field to update the average actions of neighborhood agents. However, they do not fully consider the feature information of surrounding neighbors, which often results in a local optimum. In this paper, we propose a novel partially observable multi-agent reinforcement learning algorithm to remedy this flaw. It is based on a graph attention network and a partially observable mean field, and is named GPMF for short. GPMF uses a graph attention module and a mean field module to describe how an agent is influenced by the actions of other agents at each time step. The graph attention module consists of a graph attention encoder and a differentiable attention mechanism, and outputs a dynamic graph that represents the effectiveness of neighborhood agents with respect to central agents. The mean field module approximates the effect of the neighborhood agents on a central agent as the average effect of the effective neighborhood agents. The proposed algorithm is evaluated on a typical task scenario, large-scale multi-UAV cooperative roundup, using the MAgent framework. Experimental results show that GPMF outperforms baselines including state-of-the-art partially observable mean field reinforcement learning algorithms, providing technical support for large-scale multi-UAV coordination and confrontation tasks in communication-constrained environments.
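
The two-module idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the projection matrices `w_query` and `w_key`, the softmax-plus-threshold selection (a simple stand-in for the differentiable attention mechanism), and all numbers are assumptions chosen for illustration. The sketch first marks which in-range neighbors are "effective" for a central agent, then averages the one-hot actions of only those neighbors to form a mean action.

```python
import numpy as np

def effective_neighbor_mask(features, positions, center, comm_range,
                            w_query, w_key, threshold):
    """Boolean mask over agents marking the 'effective' neighbors of `center`.

    Hypothetical stand-in for a graph attention module: softmax attention
    over in-range neighbors, followed by a hard threshold.
    """
    # Only agents within communication range (excluding the center) are visible.
    dists = np.linalg.norm(positions - positions[center], axis=1)
    in_range = dists <= comm_range
    in_range[center] = False
    if not in_range.any():
        return in_range
    # Scaled dot-product attention between the center and every agent.
    query = features[center] @ w_query
    keys = features @ w_key
    scores = keys @ query / np.sqrt(query.shape[0])
    # Out-of-range agents get zero weight after the softmax.
    scores = np.where(in_range, scores, -np.inf)
    weights = np.exp(scores - scores[in_range].max())
    weights /= weights.sum()
    return in_range & (weights >= threshold)

def mean_action(actions, mask, n_actions):
    """Average the one-hot actions of the effective neighbors only."""
    if not mask.any():
        return np.zeros(n_actions)
    one_hot = np.eye(n_actions)[actions]
    return one_hot[mask].mean(axis=0)

# Toy example (assumed values): agent 0 is the central agent.
positions = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
features = np.array([[1.0, 0.0], [1.0, 0.0], [-1.0, 0.0], [1.0, 0.0]])
actions = np.array([0, 2, 1, 2])
identity = np.eye(2)

mask = effective_neighbor_mask(features, positions, center=0, comm_range=2.0,
                               w_query=identity, w_key=identity, threshold=0.5)
print(mask)                                   # which neighbors are kept
print(mean_action(actions, mask, n_actions=3))  # mean action over kept neighbors
```

In this toy setup, agent 3 is outside the communication range and agent 2 receives a low attention weight, so only agent 1 contributes to the mean action. A learned model would train the projections and keep the selection differentiable rather than using a fixed threshold.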

GAMA: Graph Attention Multi-agent reinforcement learning algorithm for cooperation

GAT-MF: Graph Attention Mean Field for Very Large Scale Multi-Agent Reinforcement Learning

A Graph-Based Soft Actor Critic Approach in Multi-Agent Reinforcement Learning

Group-Aware Coordination Graph for Multi-Agent Reinforcement Learning

Very Large Scale Multi-Agent Reinforcement Learning with Graph Attention Mean Field

A Cooperation Graph Approach for Multiagent Sparse Reward Reinforcement Learning

Multiagent Reinforcement Learning With Heterogeneous Graph Attention Network

Attention Based Reinforcement Learning for Efficient Communication under Constraint in Multi-Agent Systems

Partially Observable Mean Field Multi-Agent Reinforcement Learning Based on Graph Attention Network for UAV Swarms

Partially Observable Mean Field Multi-Agent Reinforcement Learning Based on Graph-Attention

GCMA: an Adaptive Multi-Agent Reinforcement Learning Framework with Group Communication for Complex and Similar Tasks Coordination

Cooperative multi-agent game based on reinforcement learning

Optimal Exploration Algorithm of Multi-Agent Reinforcement Learning Methods (Student Abstract)

Cooperative Policy Learning with Pre-trained Heterogeneous Observation Representations

Graph-based Multi-Agent Reinforcement Learning for Collaborative Search and Tracking of Multiple UAVs

Cascaded Attention: Adaptive and Gated Graph Attention Network for Multiagent Reinforcement Learning

Scaling Up Multiagent Reinforcement Learning for Robotic Systems: Learn an Adaptive Sparse Communication Graph

Multiagent Reinforcement Learning With Graphical Mutual Information Maximization

Scalable and Transferable Reinforcement Learning for Multi-Agent Mixed Cooperative–Competitive Environments Based on Hierarchical Graph Attention

Adaptive Reward Method for End-to-End Cooperation Based on Multi-agent Reinforcement Learning

GHGC: Goal-based Hierarchical Group Communication in Multi-Agent Reinforcement Learning