Multi-agent navigation based on deep reinforcement learning and traditional pathfinding algorithm

Hongda Qiu
DOI: https://doi.org/10.48550/arXiv.2012.09134
2020-12-05
Abstract: We develop a new framework for the multi-agent collision avoidance problem. The framework combines a traditional pathfinding algorithm with reinforcement learning. In our approach, the agents learn at each time step, via a deep neural network trained by reinforcement learning, whether to be navigated or to take simple actions to avoid their partners. This framework makes it possible for agents to arrive at their terminal points in abstract new scenarios. In our experiments, we use Unity3D and TensorFlow to build the model and environment for our scenarios. We analyze the results and tune the parameters to approach a well-behaved strategy for our agents. Our strategy can be applied in different environments under different cases, especially when the scale is large.
Multiagent Systems, Machine Learning, Robotics
What problem does this paper attempt to address?
This paper addresses how to avoid collisions in multi-agent navigation while still ensuring that each agent reaches its target point efficiently. It proposes a new framework that combines a traditional path-planning algorithm with reinforcement learning to solve the collision-avoidance problem in multi-agent environments. At each time step, each agent uses a deep neural network to choose whether to be navigated by the pathfinder or to take a simple action that avoids other agents. The framework enables agents to reach their destinations in new abstract scenarios, showing good performance and robustness especially in large-scale environments.

The main contributions of the paper are as follows:

1. **Combining traditional path planning and reinforcement learning**: A new framework that combines a traditional path-planning algorithm with reinforcement learning is proposed to solve the multi-agent collision-avoidance problem.
2. **Robustness and efficiency**: The experimental results show that the framework is more robust across different environments and navigates more efficiently, and that the trained policies can be applied in different environments without retraining.
3. **Independent decision-making**: Although training is centralized, each agent makes its own decisions at run time, reducing dependence on a centralized system.

Through these methods, the paper aims to improve the navigation ability and safety of multi-agent systems in complex environments.
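The per-step choice described above (follow the pathfinder, or take a simple action to avoid another agent) can be illustrated with a small sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the grid world, the use of breadth-first search as the "traditional" pathfinder, the two-action set, and the names `bfs_next_step`, `stand_in_policy`, and `step`. In particular, `stand_in_policy` is a hand-written rule standing in for the trained deep neural network.

```python
from collections import deque

# Hypothetical action set: NAVIGATE defers to the pathfinder, WAIT is a simple avoidance move.
NAVIGATE, WAIT = "navigate", "wait"


def bfs_next_step(grid, start, goal):
    """Return the next cell on a shortest 4-connected path, or start if no path exists.

    grid is a list of rows; 0 marks a free cell and 1 a static obstacle.
    BFS is an assumed stand-in for the traditional pathfinding algorithm.
    """
    if start == goal:
        return start
    rows, cols = len(grid), len(grid[0])
    parent = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            break
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nxt = (nr, nc)
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0 and nxt not in parent:
                parent[nxt] = cell
                queue.append(nxt)
    if goal not in parent:
        return start
    # Walk back from the goal until we reach the cell whose parent is the start.
    cell = goal
    while parent[cell] != start:
        cell = parent[cell]
    return cell


def stand_in_policy(agent_pos, proposed, others):
    """Placeholder for the learned policy: wait if the proposed cell is occupied."""
    return WAIT if proposed in others else NAVIGATE


def step(grid, positions, goals, policy=stand_in_policy):
    """Advance all agents by one time step; each agent decides for itself."""
    new_positions = list(positions)
    for i, (pos, goal) in enumerate(zip(positions, goals)):
        proposed = bfs_next_step(grid, pos, goal)
        # Cells currently held by the other agents (already-moved agents use their new cell).
        others = {p for j, p in enumerate(new_positions) if j != i}
        if policy(pos, proposed, others) == NAVIGATE:
            new_positions[i] = proposed
    return new_positions
```

Calling `step(grid, positions, goals)` repeatedly moves each agent along its shortest path and makes it wait whenever the next cell is occupied. In a learned version, `policy` would be replaced by a network that maps each agent's local observation to one of the actions; the sketch keeps that interface but not the training.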