Abstract: Due to their high flexibility, low cost, and ease of handling, Unmanned Aerial Vehicles (UAVs) are often used to perform difficult tasks in complex environments. Stable and reliable path planning is a fundamental requirement for UAVs to accomplish their flight tasks. Most research on UAV path planning assumes known environmental information, which makes it difficult for a UAV to reach the target position safely in an unknown environment. Thus, an autonomous collision-free path planning algorithm for UAVs in unknown complex environments (APPA-3D) is proposed. An anti-collision control strategy is designed using the UAV collision safety envelope, which relies on the UAV's environmental awareness capability to continuously interact with external environmental information. A dynamic reward function for reinforcement learning (RL) that incorporates the actual flight environment is designed, and an optimized RL action exploration strategy based on action selection probability is proposed. The improved RL algorithm is then used to simulate UAV flight in an unknown environment and is trained by interacting with that environment, ultimately achieving autonomous collision-free path planning for UAVs. Comparative experiments in the same environment show that APPA-3D can effectively guide the UAV along a safe, collision-free path from the starting point to the target point in an unknown complex 3D environment.

What problem does this paper attempt to address?

This paper aims to solve the problem of autonomous path planning for unmanned aerial vehicles (UAVs) in unknown and complex environments. Specifically, most existing UAV path planning research is carried out on the premise of known environmental information, which makes it difficult for UAVs to reach the target location safely when facing unknown environments. Therefore, this paper proposes an algorithm (APPA-3D) for collision-free autonomous path planning in unknown and complex environments.

### The main contributions of the paper include:

1. **Designed the anti-collision control strategy for UAVs**:
   - Utilized the environmental perception ability of UAVs and designed an anti-collision safety envelope, which triggers different anti-collision strategies based on the distance between the UAV and obstacles.
   - Drew on the near mid-air collision (NMAC) rules of civil aircraft and the International Regulations for Preventing Collisions at Sea (COLREGS) and proposed four different anti-collision strategies to deal with different types of dynamic obstacles.
2. **Optimized the reward function generation mechanism of reinforcement learning (RL)**:
   - Combined with the artificial potential field (APF) method, designed a dynamic reward function that generates rewards in real time according to the actual flight environment information of the UAV, addressing the difficulty traditional RL algorithms have in converging in high-dimensional spaces.
3. **Proposed an RL exploration strategy based on action selection probability**:
   - Aimed at the "exploration-exploitation" dilemma faced by RL in the path planning process, proposed an RL exploration strategy based on action selection probability. This strategy dynamically adjusts action selection by taking into account the magnitude of the value function in different states, thereby improving the efficiency of path search.
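As a rough illustration of the third contribution, the sketch below ties action selection probabilities to the magnitude of the value function by blending a uniform exploration term with Boltzmann (softmax) weighting of Q-values. It is a minimal sketch under stated assumptions, not the paper's implementation: the function names `action_probabilities` and `select_action`, the parameters `epsilon` and `temperature`, and the choice of a fixed rather than adaptive schedule are all hypothetical.

```python
import math
import random
from typing import List, Sequence


def action_probabilities(
    q_values: Sequence[float], epsilon: float, temperature: float
) -> List[float]:
    """Blend uniform exploration with Boltzmann weighting of Q-values.

    Assumed form (hypothetical, for illustration only):
        p(a) = epsilon / |A| + (1 - epsilon) * softmax(Q / temperature)[a]
    """
    if not q_values:
        raise ValueError("q_values must not be empty")
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    if not 0.0 <= epsilon <= 1.0:
        raise ValueError("epsilon must lie in [0, 1]")

    # Subtract the maximum before exponentiating for numerical stability.
    q_max = max(q_values)
    weights = [math.exp((q - q_max) / temperature) for q in q_values]
    total = sum(weights)
    n_actions = len(q_values)
    return [epsilon / n_actions + (1.0 - epsilon) * w / total for w in weights]


def select_action(
    q_values: Sequence[float],
    epsilon: float,
    temperature: float,
    rng: random.Random = random.Random(),
) -> int:
    """Sample an action index according to the blended probabilities."""
    probs = action_probabilities(q_values, epsilon, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]
```

With this form, actions with larger Q-values are chosen more often, while the `epsilon` term keeps every action reachable. Lowering `temperature` makes the choice greedier, and raising it makes the choice closer to uniform.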
### Specific technical details:

- **Anti-collision safety envelope**:
  - Defined three regions: safety zone (SZ), collision avoidance zone (CZ), and mandatory collision avoidance zone (MZ). When an obstacle enters one of these regions, the UAV takes the corresponding anti-collision measures.
  - The regions are described by the following distances:
    - \( D_{\text{max}}\): the maximum detection distance of the sensor.
    - \( D_{\text{cz}}\): the threshold of the collision avoidance zone.
    - \( D_{\text{mz}}\): the threshold of the mandatory collision avoidance zone.
- **Dynamic reward function**:
  - Utilized the APF method to define an attractive potential field function and a repulsive potential field function, representing the attraction of the target point \( X_g \) and the repulsion of an obstacle \( X_o \) respectively.
  - The mathematical expressions of the potential field functions are:
    \[ U_{\text{att}}(X)=\frac{1}{2}k_{\text{att}}\|X - X_g\|^2 \]
    \[ U_{\text{rep}}(X, X_o)=\begin{cases} \frac{1}{2}k_{\text{rep}}\left(\frac{1}{\|X - X_o\|}-\frac{1}{D_{\text{safe}}}\right)^2 & \text{if }\|X - X_o\|<D_{\text{safe}}\\ 0 & \text{otherwise} \end{cases} \]
  - The total potential field function is:
    \[ U(X)=U_{\text{att}}(X)+\sum_{i}U_{\text{rep}}(X, X_{o_i}) \]
- **RL exploration strategy**:
  - Proposed an exploration strategy based on action selection probability, which improves the convergence speed and path search efficiency of the algorithm by dynamically adjusting the proportion of exploration and exploitation.
  - The mathematical expression is:
    \[ \pi(a|s)=\begin{cases} \epsilon/|A|+(1 - \epsilon)\cdot\frac{e^{Q(s,a)/T}}{\sum_{a'\in A}e^{Q(s,a')/T}} & \text{if }\tex
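The safety envelope thresholds and the potential field functions above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the paper's code: the function names are hypothetical, the zone ordering \( D_{\text{mz}} < D_{\text{cz}} < D_{\text{max}} \) and the mapping of distances to SZ, CZ, and MZ are assumed from the region names, and returning infinity when the UAV coincides with an obstacle is a choice made here to avoid division by zero.

```python
import math
from typing import Iterable, Sequence

Point = Sequence[float]


def attractive_potential(x: Point, goal: Point, k_att: float) -> float:
    """U_att(X) = 0.5 * k_att * ||X - X_g||^2."""
    return 0.5 * k_att * math.dist(x, goal) ** 2


def repulsive_potential(
    x: Point, obstacle: Point, k_rep: float, d_safe: float
) -> float:
    """U_rep(X, X_o): nonzero only when the obstacle is closer than d_safe."""
    d = math.dist(x, obstacle)
    if d >= d_safe:
        return 0.0
    if d == 0.0:
        # Assumption: treat a coincident obstacle as infinite potential.
        return math.inf
    return 0.5 * k_rep * (1.0 / d - 1.0 / d_safe) ** 2


def total_potential(
    x: Point,
    goal: Point,
    obstacles: Iterable[Point],
    k_att: float,
    k_rep: float,
    d_safe: float,
) -> float:
    """U(X) = U_att(X) + sum_i U_rep(X, X_o_i)."""
    repulsion = sum(repulsive_potential(x, o, k_rep, d_safe) for o in obstacles)
    return attractive_potential(x, goal, k_att) + repulsion


def classify_zone(distance: float, d_mz: float, d_cz: float, d_max: float) -> str:
    """Map an obstacle distance to an envelope region (assumed ordering).

    Assumes d_mz < d_cz < d_max, with the mandatory zone innermost.
    """
    if distance <= d_mz:
        return "MZ"  # mandatory collision avoidance zone
    if distance <= d_cz:
        return "CZ"  # collision avoidance zone
    if distance <= d_max:
        return "SZ"  # safety zone: detected, no avoidance needed yet
    return "undetected"  # beyond the sensor's maximum detection distance
```

A reward signal could then be derived from the change in `total_potential` between consecutive positions, although how the paper maps the potential to a reward is not specified in this summary.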

APPA-3D: an autonomous 3D path planning algorithm for UAVs in unknown complex environments

Efficient and High Path Quality Autonomous Exploration and Trajectory Planning of UAV in an Unknown Environment

A Motion Camouflage-Inspired Path Planning Method for UAVs Based on Reinforcement Learning

An Improved Artificial Potential Field Based Path Planning Algorithm for Unmanned Aerial Vehicle in Dynamic Environments

Dynamic Obstacle Avoidance Path Planning of UAVs

Autonomous localized path planning algorithm for UAVs based on TD3 strategy

Path Planning of Unmanned Aerial Vehicle in Complex Environments Based on State-Detection Twin Delayed Deep Deterministic Policy Gradient

An Autonomous Path Planning Method for Unmanned Aerial Vehicle Based on a Tangent Intersection and Target Guidance Strategy

An Accurate UAV 3-D Path Planning Method for Disaster Emergency Response Based on an Improved Multiobjective Swarm Intelligence Algorithm

An Intelligent UAV Path-Planning Method Based on the Theory of the Three-Dimensional Subdivision of Earth Space

Collision-free path planning of Unmanned Aerial robots based on A* algorithm

Overview of Research on 3D Path Planning Methods for Rotor UAV

3D path planning for UAV based on A hybrid algorithm of marine predators algorithm with quasi-oppositional learning and differential evolution

UAV Mission Path Planning Based on Reinforcement Learning in Dynamic Environment

An Anti-Disturbance Resilience Enhanced Algorithm for UAV 3D Route Planning

UAV path planning and collision avoidance in 3D environments based on POMPD and improved grey wolf optimizer

Three-dimensional path planning for unmanned aerial vehicles based on fluid flow

Dynamic Scene Path Planning of UAVs Based on Deep Reinforcement Learning

Real-Time Path Planning Based on the Situation Space of UCAVs in a Dynamic Environment

Path Planning for Autonomous Vehicles in Unknown Dynamic Environment Based on Deep Reinforcement Learning

3D real-time dynamic path planning for UAV based on improved interfered fluid dynamical system and artificial neural network