Abstract: Most existing multi-UAV collaborative search methods consider only two-dimensional path planning or static target search. To better reflect practical scenarios, this paper proposes a path planning method based on an action-mask-based multi-agent proximal policy optimization (AM-MAPPO) algorithm for multiple UAVs searching for moving targets in three-dimensional (3D) environments. In particular, a multi-UAV high–low altitude collaborative search architecture is introduced that exploits both the extensive detection range of high-altitude UAVs and the superior detection quality of low-altitude UAVs. The optimization objective of the search task is to minimize the uncertainty of the search area while maximizing the number of captured moving targets. The path planning problem for moving target search in a 3D environment is formulated and addressed using the AM-MAPPO algorithm. The proposed method incorporates a state representation mechanism based on field-of-view encoding to handle dynamic changes in the neural network's input dimensions, and it develops a rule-based target capture mechanism and an action-mask-based collision avoidance mechanism to speed up the convergence of the AM-MAPPO algorithm. Experimental results demonstrate that the proposed algorithm significantly reduces regional uncertainty and increases the number of captured moving targets compared to other deep reinforcement learning methods. Ablation studies further indicate that the proposed action mask mechanism, target capture mechanism, and collision avoidance mechanism of the AM-MAPPO algorithm improve the algorithm’s effectiveness, target capture capability, and UAVs’ safety, respectively.
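
The abstract does not spell out how the action mask is built, so the following is only a minimal sketch of the general idea of action masking for a discrete policy, not the authors' implementation. It assumes a hypothetical grid world with six axis-aligned moves, and a hypothetical rule that forbids moves leaving the grid or entering an occupied cell; the names `collision_mask`, `masked_softmax`, and `MOVES` are illustrative only.

```python
import math

def masked_softmax(logits, mask):
    """Softmax restricted to allowed actions; masked actions get probability 0."""
    if not any(mask):
        raise ValueError("at least one action must remain allowed")
    # Subtract the largest allowed logit for numerical stability.
    peak = max(l for l, ok in zip(logits, mask) if ok)
    exps = [math.exp(l - peak) if ok else 0.0 for l, ok in zip(logits, mask)]
    total = sum(exps)
    return [e / total for e in exps]

def collision_mask(pos, moves, occupied, grid_size):
    """Assumed rule: a move is allowed only if it stays inside the grid
    and does not enter a cell listed in `occupied`."""
    mask = []
    for dx, dy, dz in moves:
        nxt = (pos[0] + dx, pos[1] + dy, pos[2] + dz)
        inside = all(0 <= c < n for c, n in zip(nxt, grid_size))
        mask.append(inside and nxt not in occupied)
    return mask

# Hypothetical action set: one step along +/-x, +/-y, +/-z.
MOVES = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]

# A UAV in the grid corner with another UAV (or obstacle) at (1, 0, 0).
mask = collision_mask((0, 0, 0), MOVES, occupied={(1, 0, 0)}, grid_size=(5, 5, 3))
probs = masked_softmax([0.5, 0.1, 0.2, 0.3, 0.0, 0.4], mask)
```

In this sketch the policy can only sample the two moves that remain allowed, so unsafe actions are never explored. That is one plausible reason a mask of this kind could help convergence, in line with the abstract's claim.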

Path Planning for Multi-UAV Based on Improved Proximal Policy Optimization Algorithm

Proximal Policy Optimization for Multi-rotor UAV Autonomous Guidance, Tracking and Obstacle Avoidance

On-policy Actor-Critic Reinforcement Learning for Multi-UAV Exploration

Multiple-UAV Reinforcement Learning Algorithm Based on Improved PPO in Ray Framework

Muti-Agent Proximal Policy Optimization For Data Freshness in UAV-assisted Networks

DTPPO: Dual-Transformer Encoder-based Proximal Policy Optimization for Multi-UAV Navigation in Unseen Complex Environments

Research on the Multiagent Joint Proximal Policy Optimization Algorithm Controlling Cooperative Fixed-Wing UAV Obstacle Avoidance

Multi-UAV Path Planning and Following Based on Multi-Agent Reinforcement Learning

Path Planning of an Unmanned Aerial Vehicle Based on a Multi-Strategy Improved Pelican Optimization Algorithm

Multi-UAV Autonomous Path Planning in Reconnaissance Missions Considering Incomplete Information: A Reinforcement Learning Method

Path Planning for Unmanned Surface Vehicles with Strong Generalization Ability Based on Improved Proximal Policy Optimization

PPO-Exp: Keeping Fixed-Wing UAV Formation with Deep Reinforcement Learning

A Graph-Based PPO Approach in Multi-UAV Navigation for Communication Coverage

Proximal policy optimization with reciprocal velocity obstacle based collision avoidance path planning for multi-unmanned surface vehicles

Multi-Agent Reinforcement Learning-Based UAV Pathfinding for Obstacle Avoidance in Stochastic Environment

Learning-Based UAV Coverage-Aware Path Planning in Large-scale Urban Environments

A novel reinforcement learning based grey wolf optimizer algorithm for unmanned aerial vehicles (UAVs) path planning

Mean policy-based proximal policy optimization for maneuvering decision in multi-UAV air combat

Reinforcement-Learning-Based Multi-UAV Cooperative Search for Moving Targets in 3D Scenarios

Deep Reinforcement Learning-based Collaborative Multi-UAV Coverage Path Planning

An Improved PPO for Multiple Unmanned Aerial Vehicles