Abstract: Unmanned aerial vehicle (UAV) swarms, in which multiple UAVs work together to achieve common objectives, have found extensive applications in various fields and play a crucial role in cluster collaboration. A key challenge in swarm operations is collision-free formation control. Deep reinforcement learning methods have received significant attention for this problem, but applying them to autonomous UAVs poses challenges, including dependency on global information during training, difficulties in sampling, and excessive resource utilization. To overcome these challenges, this work proposes a novel approach based on multi-agent deep reinforcement learning (MARL) for collision-free formation control of UAV swarms. MARL allows each UAV to interact with a dynamic environment that includes other UAVs, enabling collaborative decision-making and adaptive behavior. We focus on building the state space of each individual UAV from local information. To train the policy network, we employ the multi-agent proximal policy optimization (MAPPO) algorithm, which supports robust learning and policy optimization in a multi-agent setting. We address the sampling difficulties and resource constraints with digital twin technology, which serves as a bridge between physical entities and virtual models and offers a novel approach to the intelligent collaborative control of drone swarms. By establishing models in virtual space, digital twin technology simulates real-world spaces and generates synthetic experiences for pre-training the reinforcement learning algorithm. We construct multiple digital twin environments to facilitate interactive sampling and pre-train the swarm with basic task capabilities. We then supplement the training with real-world data collected in actual environments, improving the swarm's performance in real-world scenarios.
To evaluate the effectiveness of our approach, we compare the performance of the two-stage training architecture with other policy algorithms. To validate the sample efficiency of the on-policy algorithm MAPPO, we conduct a comparative analysis against other policy algorithms, particularly off-policy algorithms. The results show that MAPPO achieves superior sample efficiency and stability in collision-free formation control. Finally, we conduct a real-flight test to validate the practicality and reliability of the strategy model derived from the digital twin environments. Overall, this work demonstrates the effectiveness of the proposed approach in enabling UAV swarms to navigate complex environments and achieve collision-free formation control.
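
For readers unfamiliar with MAPPO, the policy update at its core is the standard PPO clipped surrogate objective, evaluated on samples gathered by the UAVs. The following is a minimal sketch of that objective only, not the authors' implementation. The function name `clipped_surrogate`, the clipping range `clip_eps=0.2`, and the pooling of all agents' samples into one batch are illustrative assumptions that the abstract does not specify.

```python
import math


def clipped_surrogate(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective over a batch of samples.

    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy (assumed pooled across UAVs).
    advantages: advantage estimates for the same samples.
    clip_eps: clipping range (0.2 is a common default, assumed here).
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio between the current and the old policy.
        ratio = math.exp(lp_new - lp_old)
        # Restrict the ratio to [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the pessimistic (smaller) of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

Maximizing this quantity limits how far one update can move the policy away from the one that collected the data, which is a commonly cited reason for the stability of PPO-style on-policy training.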
