Research on Multi-robot Path Planning Method Based on Improved MADDPG Algorithm

Peng Li, Siyu Jia, Zhengyang Cai
DOI: https://doi.org/10.1109/cac53003.2021.9728309
2021-01-01
Abstract: An improved multi-robot path planning method is proposed to address the slow training and low success rate of multi-robot path planning algorithms based on Multi-Agent Deep Deterministic Policy Gradient (MADDPG). First, a form of experience storage and replay called the partitioned experience pool is designed: positive and negative experience are distinguished by dividing the state space, and experience data are then drawn from the partitions by stratified random sampling. Second, because plain random sampling tends to ignore the differing value of individual experience samples, the samples in each experience pool are ordered by priority so that high-quality experience is learned first. Third, because selecting high-priority experience increases training time, the experience pool is stored as a binary tree. Finally, two obstacle simulation environments of increasing difficulty are built on the three-dimensional simulation platform Gazebo, and the improved algorithm is compared experimentally with the original algorithm. Simulation results show that, compared with the original algorithm, the improved path planning algorithm trains in noticeably less time, its success rate in the two simulation environments increases by 18% and 22% respectively, and its average path length in the two environments is shortened by 12% and 17% respectively. These results indicate that the partitioned experience pool and the prioritized experience replay mechanism allow the algorithm to learn the optimal strategy faster and more effectively in the two environments built in this paper.
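The three mechanisms described in the abstract (partitioned pools, priority-based selection, and binary tree storage) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a sum tree as the binary tree structure, which is the usual choice for prioritized replay; it assumes the caller supplies the positive/negative label and the priority value, since the abstract does not say how either is computed; and the names `SumTree`, `PartitionedReplay`, and `positive_fraction` are hypothetical.

```python
import random


class SumTree:
    """Binary tree: leaves hold priorities, internal nodes hold child sums."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.tree = [0.0] * (2 * capacity - 1)  # internal nodes, then leaves
        self.data = [None] * capacity
        self.write = 0  # next data slot to overwrite (ring buffer)
        self.size = 0

    def total(self):
        return self.tree[0]

    def add(self, priority, item):
        leaf = self.write + self.capacity - 1
        self.data[self.write] = item
        self.update(leaf, priority)
        self.write = (self.write + 1) % self.capacity
        self.size = min(self.size + 1, self.capacity)

    def update(self, leaf, priority):
        # Propagate the priority change up to the root in O(log n).
        change = priority - self.tree[leaf]
        self.tree[leaf] = priority
        while leaf > 0:
            leaf = (leaf - 1) // 2
            self.tree[leaf] += change

    def get(self, value):
        # Descend from the root to the leaf whose cumulative range holds value.
        idx = 0
        while 2 * idx + 1 < len(self.tree):
            left = 2 * idx + 1
            if value < self.tree[left]:
                idx = left
            else:
                value -= self.tree[left]
                idx = left + 1
        return idx, self.tree[idx], self.data[idx - self.capacity + 1]


class PartitionedReplay:
    """Two prioritized pools, one for positive and one for negative experience."""

    def __init__(self, capacity_per_pool, positive_fraction=0.5):
        # positive_fraction is an assumed knob: the share of each batch
        # drawn from the positive pool.
        self.pools = {"positive": SumTree(capacity_per_pool),
                      "negative": SumTree(capacity_per_pool)}
        self.positive_fraction = positive_fraction

    def store(self, transition, is_positive, priority=1.0):
        # is_positive and priority come from the caller (assumption).
        key = "positive" if is_positive else "negative"
        self.pools[key].add(priority, transition)

    def sample(self, batch_size):
        # Stratified sampling: a fixed quota per pool, then
        # priority-proportional draws inside each pool.
        n_pos = round(batch_size * self.positive_fraction)
        quotas = {"positive": n_pos, "negative": batch_size - n_pos}
        batch = []
        for key, quota in quotas.items():
            pool = self.pools[key]
            if quota == 0 or pool.size == 0:
                continue
            for _ in range(quota):
                value = random.random() * pool.total()
                batch.append(pool.get(value)[2])
        return batch
```

With this layout, drawing one sample or updating one priority costs O(log n) per pool instead of re-sorting the whole pool, which is the kind of saving the binary tree storage is meant to provide. In use, a training loop would call `store(transition, is_positive, priority)` after each step and `sample(batch_size)` before each update.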