Flocking Control of UAV Swarms with Deep Reinforcement Learning Approach

Peng Yan, Chengchao Bai, Hongxing Zheng, Jifeng Guo
DOI: https://doi.org/10.1109/icus50048.2020.9274899
2020-01-01
Abstract: The flocking control of UAV swarms has been studied extensively because of its wide range of applications. In this paper, the UAV flocking control problem is formulated as a Partially Observable Markov Decision Process (POMDP) that accounts for the limits of each UAV's communication and perception ranges. A deep reinforcement learning approach with centralized training and decentralized execution is proposed to solve this problem. The experience collected by all UAVs is used to train a shared flocking control policy, and each UAV acts on the local environment information it observes. To enable the UAV swarm to maintain a flock while navigating an environment with dense obstacles, the reward function combines goal reaching, obstacle avoidance, and flocking maintenance. In particular, the flocking maintenance reward is designed using global information about the UAV swarm, which is available only during the training phase. Simulation results demonstrate that the policy trained with the flocking maintenance reward keeps the UAV swarm in a flock when it encounters obstacles and generalizes well to different numbers of UAVs.
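The abstract names the three reward terms but does not give their form. The following is a minimal sketch of how such a composite reward might be assembled, not the authors' formulation: the function name `composite_reward`, the weights, the safety and flock radii, and the choice of the swarm centroid as the "global information" are all assumptions made for illustration.

```python
import math

def composite_reward(positions, idx, goal, obstacles,
                     w_goal=1.0, w_obs=1.0, w_flock=1.0,
                     safe_dist=1.0, flock_radius=3.0):
    """Hypothetical per-UAV reward: goal + obstacle + flocking terms.

    positions: list of (x, y) for every UAV in the swarm (global state,
               assumed available only during centralized training).
    idx:       index of the UAV being rewarded.
    goal:      (x, y) target position.
    obstacles: list of (x, y) obstacle centers.
    All weights and radii are illustrative assumptions.
    """
    x, y = positions[idx]

    # Goal reaching: negative distance to the goal (closer is better).
    r_goal = -math.hypot(goal[0] - x, goal[1] - y)

    # Obstacle avoidance: penalize intrusion into a safety radius.
    r_obs = 0.0
    for ox, oy in obstacles:
        d = math.hypot(ox - x, oy - y)
        if d < safe_dist:
            r_obs -= safe_dist - d

    # Flocking maintenance: penalize straying from the swarm centroid.
    # The centroid needs every UAV's position, i.e. global information.
    cx = sum(p[0] for p in positions) / len(positions)
    cy = sum(p[1] for p in positions) / len(positions)
    d_centroid = math.hypot(cx - x, cy - y)
    r_flock = -max(0.0, d_centroid - flock_radius)

    return w_goal * r_goal + w_obs * r_obs + w_flock * r_flock
```

In this sketch only the reward computation reads the full `positions` list; a policy executed on board would still receive local observations only, which is consistent with using global information during training and not at execution time.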