Path planning for multiple agents in an unknown environment using soft actor critic and curriculum learning

Libo Sun, Jiahui Yan, Wenhu Qin
DOI: https://doi.org/10.1002/cav.2113
IF: 1.01
2022-08-17
Computer Animation and Virtual Worlds
Abstract: Path planning ensures that agents reach their goals along optimal paths without colliding with obstacles or other agents, and it is an important component of crowd simulation research. In this article, we propose a novel path planning approach for multiple agents that combines the soft actor critic (SAC) algorithm with curriculum learning to address the problems of a single policy and slow policy convergence in an unknown environment with sparse rewards. The path planning task is divided into lessons ordered from easy to difficult, and the neural network of the SAC algorithm learns them in sequence until it is fully competent at the complete path planning task. We also stack the state information to address the problems that limited observation causes for policy learning, and we design a comprehensive reward function so that agents reach their goals successfully and avoid collisions with static obstacles and other agents. The experimental results demonstrate that our approach can plan smooth and natural paths for multiple agents, and that our model has a certain generalization ability and better adaptability to changes in a dynamic environment. In the circle scenario and the scenario with static obstacles, the model using our approach generalizes well without policy retraining. In the emergency scenario, the model using our approach adapts to a new change in a dynamic environment without policy retraining.
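The three mechanisms named in the abstract (lessons ordered from easy to difficult, stacked state information, and a combined reward) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class and function names, the advancement rule based on a mean-reward threshold, and all numeric weights are assumptions chosen only for illustration, and the SAC learner itself is omitted.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Lesson:
    """One curriculum stage (all fields are illustrative assumptions)."""
    name: str
    num_obstacles: int
    pass_threshold: float  # mean episode reward needed to advance


class Curriculum:
    """Move through lessons, easy to hard, as the mean reward improves."""

    def __init__(self, lessons, window=3):
        self.lessons = lessons
        self.index = 0
        self.recent = deque(maxlen=window)  # last `window` episode rewards

    @property
    def current(self):
        return self.lessons[self.index]

    def report(self, episode_reward):
        """Record an episode reward; return True if the lesson advanced."""
        self.recent.append(episode_reward)
        window_full = len(self.recent) == self.recent.maxlen
        mean = sum(self.recent) / len(self.recent)
        is_last = self.index == len(self.lessons) - 1
        if window_full and not is_last and mean >= self.current.pass_threshold:
            self.index += 1
            self.recent.clear()  # start fresh statistics for the new lesson
            return True
        return False


class StackedObservation:
    """Concatenate the last k observations to offset limited sensing."""

    def __init__(self, k, obs_size):
        # Start with k zero-filled frames so the output size is constant.
        self.frames = deque([[0.0] * obs_size for _ in range(k)], maxlen=k)

    def push(self, obs):
        self.frames.append(list(obs))
        return [x for frame in self.frames for x in frame]


def step_reward(reached_goal, collided, prev_dist, dist):
    """Terminal terms plus a progress term and a step penalty (assumed weights)."""
    if collided:
        return -1.0
    if reached_goal:
        return 1.0
    return 0.5 * (prev_dist - dist) - 0.125
```

In a training loop under these assumptions, the environment would be configured from `Curriculum.current`, each raw observation would pass through `StackedObservation.push` before reaching the policy, and `Curriculum.report` would be called once per finished episode.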
computer science, software engineering