UAVs rounding up inspired by communication multi-agent depth deterministic policy gradient

Longting Jiang, Ruixuan Wei, Dong Wang
DOI: https://doi.org/10.1007/s10489-022-03986-3
IF: 5.3
2022-09-07
Applied Intelligence
Abstract: UAV rounding up is a game between a UAV swarm and its targets. The main challenges lie in achieving efficient collaboration among the UAVs and in setting the rounding-up points. This paper extends our work in three respects: an information interaction strategy model, dynamic rounding-up points, and a detailed reward function design. Inspired by the intelligence of biological swarms, this paper constructs a communication multi-agent depth deterministic policy gradient (COM-MADDPG) framework based on the communication topology during the rounding-up process, in which an information interaction strategy serves as the action policy in reinforcement learning. Rounding up is no longer limited to a fixed threshold: dynamic rounding-up points are proposed to judge the success of the mission, and each UAV has its own rounding-up area and cooperates with the others to complete the swarm mission. For situations where the target is at a corner or an edge, the reward function is redefined, which effectively avoids rounding-up failure under these special circumstances. Simulation results show that the COM-MADDPG framework performs better than DDPG and MADDPG in rounding-up tasks and helps improve the success rate, which confirms the effectiveness of its decision-making in those special situations. These results show promise owing to the robustness of the approach.
computer science, artificial intelligence
What problem does this paper attempt to address?