Deploying Ten Thousand Robots: Scalable Imitation Learning for Lifelong Multi-Agent Path Finding

He Jiang, Yutong Wang, Rishi Veerapaneni, Tanishq Duhan, Guillaume Sartoretti, Jiaoyang Li
2024-10-29
Abstract: Lifelong Multi-Agent Path Finding (LMAPF) is a variant of MAPF where agents are continually assigned new goals, necessitating frequent re-planning to accommodate these dynamic changes. Recently, this field has embraced learning-based methods, which reactively generate single-step actions based on individual local observations. However, it is still challenging for them to match the performance of the best search-based algorithms, especially in large-scale settings. This work proposes an imitation-learning-based LMAPF solver that introduces a novel communication module and systematic single-step collision resolution and global guidance techniques. Our proposed solver, Scalable Imitation Learning for LMAPF (SILLM), inherits the fast reasoning speed of learning-based methods and the high solution quality of search-based methods with the help of modern GPUs. Across six large-scale maps with up to 10,000 agents and varying obstacle structures, SILLM surpasses the best learning- and search-based baselines, achieving average throughput improvements of 137.7% and 16.0%, respectively. Furthermore, SILLM also beats the winning solution of the 2023 League of Robot Runners, an international LMAPF competition sponsored by Amazon Robotics. Finally, we validated SILLM with 10 real robots and 100 virtual robots in a mockup warehouse environment.
Multiagent Systems, Artificial Intelligence, Machine Learning, Robotics
What problem does this paper attempt to address?
The paper addresses Lifelong Multi-Agent Path Finding (LMAPF): coordinating a large number of robots that are continually assigned new goals, with the aim of maximizing throughput (the number of goals reached per time step). It proposes an imitation-learning-based solver, SILLM (Scalable Imitation Learning for LMAPF), to overcome the performance bottlenecks of existing methods on large-scale instances with thousands or even tens of thousands of robots.

### Main Challenges:
1. **Scalability for Large-Scale Instances**: Existing learning-based methods perform well on small instances but degrade significantly on large ones, where they struggle to match the best search-based algorithms.
2. **Real-Time and High Throughput**: Because agents keep receiving new goals, paths must be replanned frequently, so the solver has to be both fast and high in solution quality.
3. **Global Guidance and Local Collision Avoidance**: Avoiding congestion and improving throughput with many agents requires both effective global guidance and reliable local collision avoidance.

### Solution:
- **Imitation Learning**: SILLM learns its policy by imitating a high-performance search-based algorithm (Windowed MAPF-LNS).
- **Spatially Sensitive Communication (SSC) Module**: A new communication architecture that emphasizes spatial information so that agents can better learn efficient cooperative behavior.
- **Global Guidance Techniques**: Three types of global guidance (Backward Dijkstra, Static Guidance, Dynamic Guidance) are combined to adapt to different map structures.
- **Single-Step Collision Resolution**: An improved single-step collision resolution method (Learnable PIBT) ensures that the actions executed at each step are collision-free.
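
To make the Backward Dijkstra guidance named above more concrete, here is a minimal sketch of the general idea: run Dijkstra outward from an agent's goal so that every free cell knows its shortest distance to that goal. This illustrates the generic technique and is not the authors' implementation. The grid encoding (`#` for obstacles), the 4-connected moves, the unit edge costs, and the function names `backward_dijkstra` and `best_move` are all assumptions made for this example.

```python
import heapq

# Assumed 4-connected grid moves (row delta, column delta).
MOVES = ((1, 0), (-1, 0), (0, 1), (0, -1))


def backward_dijkstra(grid, goal):
    """Return {cell: shortest distance to goal} for every reachable free cell.

    grid is a list of equal-length strings; '#' marks an obstacle (assumed
    encoding). Every move is assumed to cost 1.
    """
    rows, cols = len(grid), len(grid[0])
    dist = {goal: 0}
    heap = [(0, goal)]
    while heap:
        d, (r, c) = heapq.heappop(heap)
        if d > dist[(r, c)]:
            continue  # stale heap entry, a shorter path was already found
        for dr, dc in MOVES:
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] != "#":
                nd = d + 1
                if nd < dist.get((nr, nc), float("inf")):
                    dist[(nr, nc)] = nd
                    heapq.heappush(heap, (nd, (nr, nc)))
    return dist


def best_move(dist, pos):
    """Pick the neighbouring cell (or pos itself, i.e. wait) closest to the goal.

    Assumes pos is a free cell from which the goal is reachable.
    """
    r, c = pos
    candidates = [(r + dr, c + dc) for dr, dc in MOVES] + [pos]
    reachable = [p for p in candidates if p in dist]
    return min(reachable, key=lambda p: dist[p])


grid = [
    "....",
    ".##.",
    "....",
]
dist = backward_dijkstra(grid, goal=(2, 3))
print(dist[(0, 0)], best_move(dist, (0, 0)))
```

A distance map like this only tells a single agent which direction shortens its own path. It ignores other agents, which is why the summary pairs guidance with a separate single-step collision resolution stage.
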
### Experimental Results:
- **Performance Improvement**: Across six large maps, SILLM improves average throughput by 137.7% over the best learning-based baseline and by 16.0% over the best search-based baseline.
- **Real-Time**: SILLM can plan for 10,000 robots in less than 1 second per time step.
- **Practical Application Verification**: SILLM was validated in a mockup warehouse environment with 10 real robots and 100 virtual robots.

In summary, SILLM combines imitation learning, a spatially aware communication module, global guidance, and single-step collision resolution to scale lifelong multi-agent path finding to as many as 10,000 agents while outperforming the strongest learning- and search-based baselines evaluated.
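
The throughput metric and the percentage improvements quoted above come down to simple arithmetic, sketched below. The function names and all input numbers are hypothetical and chosen only for illustration. They are not results from the paper, and the paper's exact averaging procedure across maps is not specified in this summary.

```python
def throughput(goals_reached_per_step):
    """Average number of goals reached per time step over a run."""
    return sum(goals_reached_per_step) / len(goals_reached_per_step)


def relative_improvement(new, baseline):
    """Percentage by which `new` exceeds `baseline`."""
    return 100.0 * (new - baseline) / baseline


# Hypothetical counts of goals reached at each of four time steps.
baseline_tp = throughput([3, 1, 2, 2])   # 8 goals over 4 steps
improved_tp = throughput([3, 2, 3, 2])   # 10 goals over 4 steps
print(baseline_tp, improved_tp, relative_improvement(improved_tp, baseline_tp))
```

Read this way, an improvement of 16.0% means the new solver reaches 1.16 times as many goals per time step as the baseline, and an improvement of 137.7% means 2.377 times as many.
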