Abstract: Unmanned aerial vehicle (UAV) swarms are widely used in many fields and play a crucial role in cluster collaboration. A swarm consists of multiple UAVs that work together to achieve common objectives. A key challenge in swarm operations is collision-free formation control. Deep reinforcement learning methods have received significant attention as a solution, but applying them to autonomous UAVs is difficult because of their dependency on global information during training, difficulties in sampling, and excessive resource utilization. To overcome these challenges, this work proposes a novel approach based on multi-agent deep reinforcement learning (MARL) for collision-free formation control of UAV swarms. MARL allows each UAV to interact with a dynamic environment that includes other UAVs, enabling collaborative decision-making and adaptive behavior. We focus on leveraging local information to establish a state space for individual UAVs. To train the policy network, we employ the multi-agent proximal policy optimization (MAPPO) algorithm, which allows robust learning and policy optimization in a multi-agent setting. In addition, we address the sampling difficulties and resource constraints by using digital twin technology, which serves as a bridge between physical entities and virtual models and offers a novel approach to the intelligent collaborative control of drone swarms. By establishing models in virtual space, digital twin technology simulates real-world spaces and generates synthetic experiences for pre-training the reinforcement learning algorithm. We construct multiple digital twin environments to facilitate interactive sampling and pre-train the swarm with basic task capabilities. We then supplement the training with data collected in actual environments, improving the swarm's performance in real-world scenarios.
To evaluate the effectiveness of our approach, we compare the performance of the two-stage training architecture with other policy algorithms. To validate the sample efficiency of the on-policy algorithm MAPPO, we conduct a comparative analysis with other policy algorithms, particularly off-policy algorithms. The results reveal the superior sample efficiency and stability of MAPPO in addressing the challenges of collision-free formation control. Finally, we conduct a real-flight test to validate the practicality and reliability of the strategy model derived from the digital twin environments. Overall, this work demonstrates the effectiveness of the proposed approach in enabling UAV swarms to navigate complex environments and achieve collision-free formation control.
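Two ideas in the abstract can be illustrated with a minimal Python sketch: a per-UAV state built only from local information, and the clipped surrogate objective of PPO, on which MAPPO builds. This is not the authors' implementation. The function names, the 2-D setting, the sensing radius, the number of neighbour slots `k`, and the assumption that each UAV knows the relative vector to its formation slot (`goal_rel`) are all hypothetical choices made for illustration.

```python
import math
from typing import List, Sequence, Tuple

Vec = Tuple[float, float]


def local_observation(own_pos: Vec, own_vel: Vec, others: Sequence[Vec],
                      goal_rel: Vec, sense_radius: float = 5.0,
                      k: int = 2) -> List[float]:
    """Build a fixed-length observation from locally available quantities.

    Hypothetical layout: own velocity, relative vector to the assigned
    formation slot, then relative positions of up to k sensed neighbours.
    """
    obs = [own_vel[0], own_vel[1], goal_rel[0], goal_rel[1]]
    # Relative positions of the other UAVs, as seen from this UAV.
    rel = [(p[0] - own_pos[0], p[1] - own_pos[1]) for p in others]
    # Keep only neighbours inside the (assumed) sensing radius.
    rel = [r for r in rel if math.hypot(r[0], r[1]) <= sense_radius]
    # Nearest neighbours first, truncated to k slots.
    rel.sort(key=lambda r: math.hypot(r[0], r[1]))
    nearest = rel[:k]
    for r in nearest:
        obs.extend(r)
    # Zero-pad unused neighbour slots so the length is always 4 + 2k.
    obs.extend([0.0] * (2 * (k - len(nearest))))
    return obs


def ppo_clip_objective(logp_new: Sequence[float], logp_old: Sequence[float],
                       advantages: Sequence[float], eps: float = 0.2) -> float:
    """Mean clipped surrogate objective of PPO (to be maximized)."""
    total = 0.0
    for ln, lo, adv in zip(logp_new, logp_old, advantages):
        ratio = math.exp(ln - lo)                    # pi_new(a|s) / pi_old(a|s)
        clipped = max(1.0 - eps, min(1.0 + eps, ratio))
        total += min(ratio * adv, clipped * adv)     # pessimistic bound
    return total / len(advantages)
```

The zero-padding keeps the observation length fixed regardless of how many neighbours are in range, which is one common way to feed a variable neighbourhood to a fixed-size policy network. The clipping limits how far a single update can move the policy from the one that collected the samples.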

Decentralized Learning Control for Multi-UAV Swarm Simultaneous Coverage and Tracking

A Reinforcement Learning-based Decentralized Method of Avoiding Multi-UAV Collision in 3-D Airspace

Decentralized UAV Swarm Control: A Multi-Layered Architecture for Integrated Flight Mode Management and Dynamic Target Interception

Learning-Based Multi-UAV Flocking Control With Limited Visual Field and Instinctive Repulsion

High-Sample-Efficient Multiagent Reinforcement Learning for Navigation and Collision Avoidance of UAV Swarms in Multitask Environments

Coverage Control for UAV Swarm Communication Networks: A Distributed Learning Approach

Joint Communication and Action Learning in Multi-Target Tracking of UAV Swarms with Deep Reinforcement Learning

Multi-UAV Cooperative Target Tracking Based on Swarm Intelligence

Reinforcement Learning-Based Swarm Control for UAVs in Static and Dynamic Multi-Obstacle Environments

Flocking Control of UAV Swarms with Deep Reinforcement Learning Approach

UAV Swarm Deployment and Trajectory for 3D Area Coverage via Reinforcement Learning

Joint Optimization of Multi-UAV Deployment and User Association Via Deep Reinforcement Learning for Long-Term Communication Coverage

Digital Twin-Based Obstacle Avoidance Method for Unmanned Aerial Vehicle Formation Control Using Deep Reinforcement Learning

UAV Swarm Cooperative Target Search: A Multi-Agent Reinforcement Learning Approach

Distributed UAV Swarms for 3D Urban Area Coverage with Incomplete Information Using Event-Triggered Hierarchical Reinforcement Learning

Scalable Task-Driven Robotic Swarm Control via Collision Avoidance and Learning Mean-Field Control

Multi-UAV Collaborative Detection Based on Reinforcement Learning

Reinforcement Learning-Based Dynamic Coverage Control of Multi-Rotor UAVs with Safety Priority

Fully Decentralized Federated Learning-Based On-Board Mission for UAV Swarm System

A deep reinforcement learning based distributed multi-UAV dynamic area coverage algorithm for complex environment

Reinforcement Learning Based Two-Level Control Framework of UAV Swarm for Cooperative Persistent Surveillance in an Unknown Urban Area