Multi-agent air combat with two-stage graph-attention communication

Zhixiao Sun, Huahua Wu, Yandong Shi, Xiangchao Yu, Yifan Gao, Wenbin Pei, Zhen Yang, Haiyin Piao, Yaqing Hou
DOI: https://doi.org/10.1007/s00521-023-08784-7
2023-07-07
Neural Computing and Applications
Abstract: An air-to-air combat system is a complex multi-agent system (MAS) in which a large number of unmanned combat aerial vehicles learn to fight their opponents in a highly dynamic and uncertain environment. Because each agent observes only its local surroundings, classical multi-agent learning methods struggle to learn effective cooperative strategies. Communication mechanisms have recently been proposed to address this local observability issue in MAS. However, existing methods with predefined rules easily cause an exponential increase in state–action pairs, leading to high communication costs. To address this, the paper designs a graph neural network based on a two-stage graph-attention mechanism that captures the key interaction relationships and communication connections between agents in complex air-to-air combat scenarios. Built on Multi-Agent Proximal Policy Optimization as the backbone multi-agent reinforcement learning method, the proposed method uses a hard- and soft-attention scheme to dynamically adjust the communication relationships and ad hoc network among agents: it cuts off unrelated interaction connections while simultaneously learning the relative importance of each remaining agent pair. Finally, experiments in a simulation environment validate the effectiveness of the proposed method on large-scale air-to-air combat problems.
computer science, artificial intelligence
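
The hard- and soft-attention scheme described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the thresholded sigmoid gate, the dot-product softmax, and the names `two_stage_attention`, `w_hard`, and `threshold` are all assumptions chosen for illustration, and the abstract does not specify how either stage is parameterized. The first stage decides which agent pairs may communicate at all; the second assigns importance weights only over the links that survive.

```python
import math
import random


def two_stage_attention(h, w_hard, threshold=0.5):
    """Illustrative two-stage attention over agent features (hypothetical).

    h: list of per-agent feature vectors, each of length d.
    w_hard: assumed gate weights of length 2 * d (not from the paper).
    Returns (mask, weights, messages).
    """
    n, d = len(h), len(h[0])

    # Stage 1 (hard attention): keep or cut each directed link i <- j.
    mask = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # no self-communication
            score = sum(w * x for w, x in zip(w_hard, h[i] + h[j]))
            gate = 1.0 / (1.0 + math.exp(-score))
            mask[i][j] = gate > threshold

    # Stage 2 (soft attention): softmax over the surviving neighbours only.
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        kept = [j for j in range(n) if mask[i][j]]
        if not kept:
            continue  # agent i receives no messages this step
        logits = [
            sum(a * b for a, b in zip(h[i], h[j])) / math.sqrt(d) for j in kept
        ]
        top = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(v - top) for v in logits]
        total = sum(exps)
        for j, e in zip(kept, exps):
            weights[i][j] = e / total

    # Aggregate neighbour features into one incoming message per agent.
    messages = [
        [sum(weights[i][j] * h[j][c] for j in range(n)) for c in range(d)]
        for i in range(n)
    ]
    return mask, weights, messages


# Toy usage: 5 agents with 4-dimensional random features.
random.seed(0)
h = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(5)]
w_hard = [random.uniform(-1, 1) for _ in range(8)]
mask, weights, messages = two_stage_attention(h, w_hard)
```

Because cut links get exactly zero weight, an agent's message depends only on the neighbours the hard stage kept, which is how such a scheme could reduce communication compared with attending to every other agent.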