Abstract: Path planning is an essential part of task planning. However, path planning for multiple unmanned aerial vehicles (UAVs) is challenging because it must account for cooperation among the UAVs and for environmental uncertainty. This study proposes a novel task-decomposed multi-agent twin delayed deep deterministic policy gradient (TD-MATD3) algorithm that enables UAVs to execute path planning in complex environments with multiple obstacles. TD-MATD3 improves upon the multi-agent twin delayed deep deterministic policy gradient (MATD3) algorithm by decomposing the path planning task into a navigation task module for flying to the target and an obstacle avoidance task module for avoiding obstacles and other UAVs. Specifically, TD-MATD3 splits the Actor-Critic network structure of MATD3 into two corresponding parts according to the reward functions of the two task modules. The navigation features output by the Actor-Critic network of the navigation task module are fed into the Actor-Critic network of the obstacle avoidance task module to guide the UAVs in completing the overall path planning task. A novel reward function is also proposed to facilitate convergence of the algorithm. Experimental results indicate that TD-MATD3 accelerates convergence and improves the convergence result during training, and that it achieves a higher success rate in complex dynamic environments than multi-agent deep deterministic policy gradient (MADDPG) and MATD3 on the multi-UAV path planning problem.
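
The decomposition described in the abstract can be illustrated with a minimal sketch of the actor side only. The abstract does not give layer sizes, observation layouts, the way navigation features are injected, or the form of the reward terms, so everything below is an assumption made for illustration: the names `decomposed_actor` and `decomposed_reward`, the dimensions, the concatenation of features with the obstacle observation, and both reward formulas are hypothetical and are not the authors' implementation. The critics, the twin-Q update, and delayed policy updates of MATD3 are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp_params(in_dim, hidden, out_dim):
    """Random weights for a two-layer perceptron (illustrative only)."""
    return {
        "W1": rng.normal(0.0, 0.1, (in_dim, hidden)), "b1": np.zeros(hidden),
        "W2": rng.normal(0.0, 0.1, (hidden, out_dim)), "b2": np.zeros(out_dim),
    }

def mlp(params, x):
    """ReLU hidden layer followed by a tanh output layer."""
    h = np.maximum(0.0, x @ params["W1"] + params["b1"])
    return np.tanh(h @ params["W2"] + params["b2"])

# Hypothetical sizes: none of these values come from the paper.
NAV_OBS, AVOID_OBS, FEAT, ACT = 4, 8, 16, 2

nav_actor = mlp_params(NAV_OBS, 32, FEAT)            # navigation task module
avoid_actor = mlp_params(AVOID_OBS + FEAT, 32, ACT)  # obstacle avoidance task module

def decomposed_actor(nav_obs, avoid_obs):
    """Feed navigation features into the avoidance module to get an action."""
    nav_feat = mlp(nav_actor, nav_obs)
    # Assumption: features are injected by simple concatenation.
    joint = np.concatenate([avoid_obs, nav_feat], axis=-1)
    return mlp(avoid_actor, joint)  # action components lie in [-1, 1]

def decomposed_reward(dist_prev, dist_now, min_clearance, safe_radius=1.0):
    """Return one reward per task module (both forms are assumed)."""
    r_nav = dist_prev - dist_now                      # progress toward the target
    r_avoid = -max(0.0, safe_radius - min_clearance)  # penalty inside the safety radius
    return r_nav, r_avoid
```

Calling `decomposed_actor` with a batch of observations of shape `(N, 4)` and `(N, 8)` yields actions of shape `(N, 2)`. Keeping the two reward terms separate, rather than summing them, is what would allow each module's critic to be trained against its own signal.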

Dynamic Multi-Agent Deep Deterministic Policy Gradient for Autonomous Navigation of Reconfigurable Unmanned Aerial Vehicle

Mapless Collaborative Navigation for a Multi-Robot System Based on the Deep Reinforcement Learning

Implantable cardioverter defibrillators after acute myocardial infarction

Novel task decomposed multi-agent twin delayed deep deterministic policy gradient algorithm for multi-UAV autonomous path planning

Autonomous Navigation of Unmanned Vehicle Through Deep Reinforcement Learning

Autonomous Navigation of UAV in Large-Scale Unknown Complex Environment with Deep Reinforcement Learning

Autonomous Navigation of UAVs in Large-Scale Complex Environments: A Deep Reinforcement Learning Approach

Autonomous obstacle avoidance of UAV based on deep reinforcement learning

3M-RL: Multi-Resolution, Multi-Agent, Mean-Field Reinforcement Learning for Autonomous UAV Routing

DTPPO: Dual-Transformer Encoder-based Proximal Policy Optimization for Multi-UAV Navigation in Unseen Complex Environments

Path Planning of Unmanned Aerial Vehicle in Complex Environments Based on State-Detection Twin Delayed Deep Deterministic Policy Gradient

Multi-Agent Reinforcement Learning for Unmanned Aerial Vehicle Coordination by Multi-Critic Policy Gradient Optimization

Deep-reinforcement-learning-based UAV autonomous navigation and collision avoidance in unknown environments

Multiple unmanned aerial vehicle coordinated strikes against ground targets based on an improved multi-agent deep deterministic policy gradient algorithm

Multi-UAV Path Planning and Following Based on Multi-Agent Reinforcement Learning

Target tracking strategy using deep deterministic policy gradient

A Reinforcement Learning-based Decentralized Method of Avoiding Multi-UAV Collision in 3-D Airspace

Deep reinforcement learning and its application in autonomous fitting optimization for attack areas of UCAVs

Multi-USV Dynamic Navigation and Target Capture: A Guided Multi-Agent Reinforcement Learning Approach

Deep Reinforcement Learning-Driven UAV Data Collection Path Planning: A Study on Minimizing AoI

Multi-UAV Automatic Dynamic Obstacle Avoidance With Experience-Shared A2C