Research on trajectory planning based on reinforcement learning algorithm of deep deterministic policy gradient
Yang Youbo, Zhang Mu, Tang Jun, Lei Yinjie
DOI: https://doi.org/10.3969/j.issn.1007-1423.2023.05.001
2023-01-01
Abstract: Trajectory planning is an important part of the intelligent development of UAVs. Traditional route planning algorithms suffer from poor real-time planning ability, an inability to handle dynamic scenes, and non-smooth trajectories. Existing reinforcement learning algorithms can plan in real time, but most are applied mainly in two-dimensional scenes and suffer from frequent collisions with obstacles, low arrival rates, and non-smooth, low-quality trajectories. To address these problems, this paper proposes an improved reinforcement learning algorithm based on the deep deterministic policy gradient (DDPG). The algorithm integrates a self-attention mechanism to extract obstacle features, which addresses the low arrival rate and poor real-time planning ability; redesigns the reward function to penalize the UAV’s “retreat” behavior; and introduces a direction vector angle guidance mechanism to improve trajectory smoothness. Simulation results show that, in complex dynamic scenes, the improved algorithm achieves a 93.5% arrival rate, reduces the average flight distance by 7.3%, and reduces inference time by 26.2%, and that the resulting trajectories meet UAV flight requirements.
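The abstract names two reward-shaping ideas, a penalty for "retreat" behavior and guidance by a direction vector angle, without giving their formulas. The following is only a minimal sketch of how such a reward might look, not the paper's actual reward function. It assumes that "retreat" means a step that increases the distance to the goal, and that the guidance angle is the angle between the step direction and the direction to the goal. The function name `shaped_reward` and the weights `w_progress`, `w_retreat`, and `w_angle` are hypothetical.

```python
import math


def _sub(a, b):
    """Component-wise difference of two 3D points."""
    return tuple(x - y for x, y in zip(a, b))


def _norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in v))


def _dot(a, b):
    """Dot product of two vectors."""
    return sum(x * y for x, y in zip(a, b))


def shaped_reward(prev_pos, pos, goal,
                  w_progress=1.0, w_retreat=2.0, w_angle=0.5):
    """Hypothetical per-step reward; all weights are illustrative assumptions."""
    # Progress is positive when the step brings the UAV closer to the goal.
    progress = _norm(_sub(goal, prev_pos)) - _norm(_sub(goal, pos))
    reward = w_progress * progress

    # Assumed "retreat" penalty: extra cost when the step moves away from the goal.
    if progress < 0:
        reward += w_retreat * progress

    # Assumed angle guidance: penalize the angle between the step direction
    # and the direction to the goal, normalized to [0, 1].
    step = _sub(pos, prev_pos)
    to_goal = _sub(goal, prev_pos)
    denom = _norm(step) * _norm(to_goal)
    if denom > 0:
        cos_angle = max(-1.0, min(1.0, _dot(step, to_goal) / denom))
        reward -= w_angle * math.acos(cos_angle) / math.pi

    return reward


if __name__ == "__main__":
    goal = (10.0, 0.0, 0.0)
    print(shaped_reward((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), goal))   # step toward the goal
    print(shaped_reward((0.0, 0.0, 0.0), (-1.0, 0.0, 0.0), goal))  # step away from the goal
```

Under these assumptions, a step straight toward the goal scores higher than a step straight away from it, since the latter incurs both the retreat penalty and the maximum angle penalty. In a DDPG setup a term like this would be added to the collision and arrival rewards, which are omitted here.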