Abstract: Path planning is a key component of an Unmanned Aerial Vehicle (UAV) mission and a prerequisite for its successful completion. Traditional path planning algorithms have limitations in complex, dynamic environments. For environments with complex dynamic obstacles, this paper proposes an improved TD3 algorithm that enables the UAV to plan its path autonomously through online learning and continual trial and error. First, the experience pool of TD3 is replaced with prioritized experience replay, so that the agent can distinguish the importance of experience samples, which improves sampling efficiency and reduces training time. Second, an average TD3 is proposed: an averaged value is used when the target value is updated, which addresses value overestimation while avoiding underestimation, so that the improved algorithm is more stable and can adapt to a variety of complex obstacle environments. Third, a new reward function is designed so that every action step of the UAV receives reward feedback, which addresses the sparse-reward problem in deep reinforcement learning. Experimental results show that the method can train the UAV to reach the target safely and quickly in a multi-obstacle environment. Compared with DDPG, SAC, and the original TD3, the proposed algorithm achieves a higher path planning success rate and a lower collision rate, indicating better path planning performance.
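The two algorithmic changes named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The abstract does not state which quantities are averaged, so the "mean" mode below assumes the average is taken over the two target critics' estimates. The replay priorities assume the common proportional scheme based on absolute TD error. All function names, parameter names, and default values (`gamma`, `alpha`, `eps`) are hypothetical.

```python
import numpy as np


def td3_target(reward, done, q1_next, q2_next, gamma=0.99, mode="min"):
    """Bootstrapped target value for a batch of transitions.

    mode="min":  standard TD3 clipped double-Q target.
    mode="mean": averaged variant; ASSUMPTION: the average is taken over
                 the two target critics, which the abstract does not specify.
    """
    if mode == "min":
        q_next = np.minimum(q1_next, q2_next)
    elif mode == "mean":
        q_next = 0.5 * (q1_next + q2_next)
    else:
        raise ValueError(f"unknown mode: {mode}")
    # Terminal transitions (done == 1) do not bootstrap from the next state.
    return reward + gamma * (1.0 - done) * q_next


def per_probabilities(td_errors, alpha=0.6, eps=1e-6):
    """Proportional prioritized replay: P(i) = p_i^alpha / sum_k p_k^alpha,
    with p_i = |td_error_i| + eps (assumed priority definition)."""
    priorities = (np.abs(td_errors) + eps) ** alpha
    return priorities / priorities.sum()


def sample_indices(probabilities, batch_size, seed=0):
    """Draw replay-buffer indices according to the given probabilities."""
    rng = np.random.default_rng(seed)
    return rng.choice(len(probabilities), size=batch_size, p=probabilities)


# Toy batch of two transitions; the second one is terminal.
reward = np.array([1.0, 0.0])
done = np.array([0.0, 1.0])
q1_next = np.array([2.0, 5.0])
q2_next = np.array([4.0, 3.0])

y_min = td3_target(reward, done, q1_next, q2_next, gamma=0.9, mode="min")
y_mean = td3_target(reward, done, q1_next, q2_next, gamma=0.9, mode="mean")

probs = per_probabilities(np.array([0.1, 1.0, 2.0]))
batch = sample_indices(probs, batch_size=4)
```

In this toy batch the averaged target for the non-terminal transition lies above the minimum-based one, which is the direction in which averaging would counteract underestimation. Transitions with larger TD error receive a higher sampling probability.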

UAV Path Planning Based on Multicritic-Delayed Deep Deterministic Policy Gradient

Path Planning of Unmanned Aerial Vehicle in Complex Environments Based on State-Detection Twin Delayed Deep Deterministic Policy Gradient

Novel task decomposed multi-agent twin delayed deep deterministic policy gradient algorithm for multi-UAV autonomous path planning

Path Planning for UAV Ground Target Tracking via Deep Reinforcement Learning

UAV Path Planning Employing MPC-Reinforcement Learning Method Considering Collision Avoidance

UAV Path Planning Based on the Average TD3 Algorithm With Prioritized Experience Replay

Deep Reinforcement Learning-Driven UAV Data Collection Path Planning: A Study on Minimizing AoI

Path Following for Autonomous Ground Vehicle Using DDPG Algorithm: A Reinforcement Learning Approach

Autonomous Obstacle Avoidance and Target Tracking of UAV Based on Deep Reinforcement Learning

A Motion Camouflage-Inspired Path Planning Method for UAVs Based on Reinforcement Learning

Memory-Enhanced Twin Delayed Deep Deterministic Policy Gradient (ME-TD3)-Based Unmanned Combat Aerial Vehicle Trajectory Planning for Avoiding Radar Detection Threats in Dynamic and Unknown Environments

Path Planning in Complex Environments Using Attention-Based Deep Deterministic Policy Gradient

B-APFDQN: A UAV Path Planning Algorithm Based on Deep Q-Network and Artificial Potential Field

Efficient Path Planning for Mobile Robot Based on Deep Deterministic Policy Gradient

Multiple UAVs Path Planning Based on Deep Reinforcement Learning in Communication Denial Environment

Computation offloading optimization for UAV-assisted mobile edge computing: a deep deterministic policy gradient approach

Multi-UAV simultaneous target assignment and path planning based on deep reinforcement learning in dynamic multiple obstacles environments

Maneuvering target tracking of UAV based on MN-DDPG and transfer learning

Dynamic Scene Path Planning of UAVs Based on Deep Reinforcement Learning

Multi-Agent Path Planning based on MPC and DDPG

Autonomous UAV Navigation: A DDPG-based Deep Reinforcement Learning Approach