Multi-agent Reinforcement Learning for a Special Formation Problem

Qu Changsheng, Ke Liangjun, Xuan Shuzhe, Wang Zhigang
DOI: https://doi.org/10.1007/978-981-19-3998-3_53
2022-01-01
Abstract: Formation control has long been one of the core problems in multi-agent collaboration. Its goal is to make multiple agents form a formation during a task and move to a designated target point. In this paper, reinforcement learning is used to address a special formation problem, and mean field theory is applied to the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm to make it effective in large-scale multi-agent formation problems. The resulting algorithm is compared with MADDPG and the Deep Deterministic Policy Gradient (DDPG) algorithm. In the simulation experiment, a team of UAVs is initialized at random positions in two-dimensional space, and the goal is to make the UAVs form an equilateral triangle with the shortest total displacement. We model the formation problem and design a reward function based on the goal of forming an equilateral triangle and the requirement of the shortest total displacement. Moreover, we define an equilateral triangle criterion and use it to evaluate the formation quality achieved by multi-agent reinforcement learning. The results show that MADDPG with the mean field method has clear advantages over MADDPG and DDPG in convergence speed and success rate on the formation problem.
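
The abstract does not give the paper's definitions of the equilateral triangle criterion or the reward function. As a rough illustration only, the two ideas might be sketched as follows for three agents in the plane. The function names, the tolerance `rel_tol`, the weight `move_weight`, and the restriction to exactly three agents are all assumptions made for this sketch, not the authors' formulation.

```python
import math
from itertools import combinations


def side_lengths(points):
    """Return the three pairwise distances between three 2-D points."""
    return [math.dist(p, q) for p, q in combinations(points, 2)]


def is_equilateral(points, rel_tol=0.05):
    """Hypothetical criterion: every side is within rel_tol of the mean side."""
    sides = side_lengths(points)
    mean = sum(sides) / len(sides)
    if mean == 0:
        return False  # all agents at the same position: no triangle
    return all(abs(s - mean) <= rel_tol * mean for s in sides)


def reward(prev_points, points, move_weight=0.1):
    """Hypothetical reward: penalize unequal sides and distance moved this step."""
    sides = side_lengths(points)
    spread = max(sides) - min(sides)
    moved = sum(math.dist(p, q) for p, q in zip(prev_points, points))
    return -spread - move_weight * moved
```

Under these assumptions, a formation that is already equilateral and does not move receives a reward close to zero, and any extra displacement or side-length mismatch lowers it.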