Improving multi-target cooperative tracking guidance for UAV swarms using multi-agent reinforcement learning

ZHOU Wenhong, LI Jie, LIU Zhihong, SHEN Lincheng
DOI: https://doi.org/10.1016/j.cja.2021.09.008
IF: 5.7
2021-10-01
Chinese Journal of Aeronautics
Abstract: Multi-Target Tracking Guidance (MTTG) in unknown environments has great potential value for Unmanned Aerial Vehicle (UAV) swarm applications. Although Multi-Agent Deep Reinforcement Learning (MADRL) is a promising technique for learning cooperation, most existing methods do not scale well to decentralized UAV swarms because of their computational complexity or their need for global information. This paper proposes a decentralized MADRL method that uses the maximum reciprocal reward to learn cooperative tracking policies for UAV swarms. The method reshapes each UAV's reward with a regularization term defined as the dot product of the reward vector of all neighbor UAVs and the corresponding dependency vector between the UAV and its neighbors. The dependence between UAVs is captured directly by a Pointwise Mutual Information (PMI) neural network, without complicated aggregation statistics. The experience-sharing Reciprocal Reward Multi-Agent Actor-Critic (MAAC-R) algorithm is then proposed to learn a shared cooperative policy for all homogeneous UAVs. Experiments demonstrate that the proposed algorithm improves the UAVs' cooperation more effectively than the baseline algorithms and stimulates a rich variety of cooperative tracking behaviors in UAV swarms. In addition, the learned policy scales better to other scenarios with more UAVs and targets.
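The reward reshaping described in the abstract can be illustrated with a minimal sketch. It assumes the regularization term is simply added to the UAV's own reward. The function name `reciprocal_reward` and the `weight` coefficient are hypothetical and do not come from the paper, and the dependency values are placeholders for what the PMI network would estimate.

```python
from typing import Sequence


def reciprocal_reward(
    own_reward: float,
    neighbor_rewards: Sequence[float],
    dependencies: Sequence[float],
    weight: float = 1.0,  # hypothetical scaling factor, not from the paper
) -> float:
    """Reshape one UAV's reward with a dot-product regularization term.

    neighbor_rewards[j] is the reward of neighbor j, and dependencies[j]
    is the dependency between this UAV and neighbor j (in the paper this
    is estimated by a PMI network; here it is supplied by the caller).
    """
    if len(neighbor_rewards) != len(dependencies):
        raise ValueError("need exactly one dependency value per neighbor")
    # Dot product of the neighbor reward vector and the dependency vector.
    regularizer = sum(r * d for r, d in zip(neighbor_rewards, dependencies))
    return own_reward + weight * regularizer


# Example: a UAV with two neighbors and placeholder dependency values.
shaped = reciprocal_reward(1.0, [2.0, 4.0], [0.5, 0.25])
```

In this sketch a UAV with no neighbors keeps its original reward, since the dot product over empty vectors is zero.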
engineering, aerospace
What problem does this paper attempt to address?