A New AGV Path Planning Method Based On PPO Algorithm

Haodong Wang, Jiangyu Hao, Wenhao Wu, Aipeng Jiang, Kai Mao, Yudong Xia
DOI: https://doi.org/10.23919/CCC58697.2023.10240661
2023-01-01
Abstract: Traditional AGV (automated guided vehicle) path planning algorithms struggle in complex, dynamic environments, and some reinforcement learning algorithms are limited to discrete actions and suffer from sparse rewards. To address these problems, this paper proposes an AGV path planning method based on the PPO (Proximal Policy Optimization) algorithm. First, the method builds an action decision-making system on a normal distribution parameterized by the mean and variance output by the neural network, which removes the restriction to discrete output actions. Second, it adds auxiliary rewards to the main-line rewards to alleviate reward sparsity. Finally, a path planning method based on this algorithm is designed. To verify its performance, the method is compared with the DQN algorithm in three Gazebo simulation environments. The results show that the PPO-based method converges faster and more stably than DQN, and that it has a clear advantage in task completion time.
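The two ingredients named in the abstract, sampling a continuous action from a normal distribution and adding auxiliary rewards to a sparse main reward, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the clipping bounds, the distance-based progress term, and all numeric constants (the terminal rewards of 100 and -100, the step cost) are assumptions chosen for illustration, and the mean and variance are passed in as plain numbers rather than produced by a network.

```python
import math
import random


def sample_action(mean, variance, low, high, rng=random):
    """Sample one continuous action from N(mean, variance).

    Returns the action clipped to [low, high] and the log-probability
    of the unclipped sample, which a PPO update would use in its
    probability ratio. All parameter names are illustrative.
    """
    std = math.sqrt(variance)
    raw = rng.gauss(mean, std)
    # Log-density of a univariate Gaussian evaluated at the raw sample.
    log_prob = (-0.5 * (raw - mean) ** 2 / variance
                - 0.5 * math.log(2.0 * math.pi * variance))
    action = min(max(raw, low), high)
    return action, log_prob


def shaped_reward(prev_dist, dist, collided, reached,
                  k_progress=1.0, step_cost=0.01):
    """Sparse main reward plus a dense auxiliary reward.

    Main reward (assumed values): +100 on reaching the goal,
    -100 on collision. Auxiliary reward (assumed form): progress
    toward the goal since the last step, minus a small per-step cost.
    """
    if reached:
        return 100.0
    if collided:
        return -100.0
    return k_progress * (prev_dist - dist) - step_cost


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical angular-velocity command bounded to [-1, 1] rad/s.
    action, log_prob = sample_action(0.2, 0.25, -1.0, 1.0, rng)
    print("action:", action, "log_prob:", log_prob)
    # The AGV moved from 2.0 m to 1.5 m away from the goal.
    print("reward:", shaped_reward(2.0, 1.5, collided=False, reached=False))
```

With a reward of this shape the agent receives a learning signal on every step rather than only at the goal or on collision, which is the sense in which auxiliary rewards alleviate sparsity.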