Group-Based Deep Reinforcement Learning in Multi-UAV Confrontation

Shengang Li, Baolai Wang, Tao Xie
DOI: https://doi.org/10.1007/978-3-030-92307-5_72
2021-01-01
Abstract: Deep reinforcement learning (DRL) algorithms are increasingly applied in multi-agent environments. However, most DRL algorithms do not address group cooperation: each agent explores in the direction that benefits itself and ignores the situation of its teammates, so the learned policy easily falls into a local optimum. This paper addresses this problem in a multi-UAV confrontation scenario. We seek the optimal cooperative policy by dividing UAVs into several groups and having them learn to cooperate with teammates autonomously. Specifically, we propose an algorithm called group-based actor-critic (GBAC). We group UAVs by setting an observation radius, and we use a double Q network to process rewards, which we divide into individual rewards and group rewards: the Q network processes individual rewards, and the group-Q network processes group rewards. As a result, UAVs obtain higher rewards through group cooperation, and UAVs trained with our method outperform those trained with other DRL methods. In short, we use a group-based DRL method to solve the group cooperation problem and maximize the expected return in multi-UAV confrontation.
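The abstract gives no implementation details, so the following is only an illustrative sketch of the two ideas it names: grouping UAVs by an observation radius, and separating individual rewards from group rewards. The function names, the 2D positions, and the choice of the mean individual reward as the group reward are all assumptions made for illustration, not the authors' GBAC definitions.

```python
import math


def group_by_radius(positions, radius):
    """For each UAV, list the indices of UAVs within `radius` (itself included).

    Assumption: positions are 2D points and distance is Euclidean.
    """
    groups = []
    for xi, yi in positions:
        members = [
            j
            for j, (xj, yj) in enumerate(positions)
            if math.hypot(xi - xj, yi - yj) <= radius
        ]
        groups.append(members)
    return groups


def group_rewards(individual_rewards, groups):
    """Hypothetical group reward: mean individual reward over each UAV's group."""
    return [
        sum(individual_rewards[j] for j in members) / len(members)
        for members in groups
    ]


# Three UAVs: the first two are within the radius of each other, the third is isolated.
positions = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
groups = group_by_radius(positions, radius=2.0)
rewards = group_rewards([1.0, 3.0, -2.0], groups)
```

In a setup like this, the individual rewards would be the training signal for the Q network and the group rewards the signal for the group-Q network; how GBAC actually computes and combines them is not stated in the abstract.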