Abstract: Lifelong Multi-Agent Path Finding (LMAPF) is a variant of MAPF where agents are continually assigned new goals, necessitating frequent re-planning to accommodate these dynamic changes. Recently, this field has embraced learning-based methods, which reactively generate single-step actions based on individual local observations. However, it is still challenging for them to match the performance of the best search-based algorithms, especially in large-scale settings. This work proposes an imitation-learning-based LMAPF solver that introduces a novel communication module and systematic single-step collision resolution and global guidance techniques. Our proposed solver, Scalable Imitation Learning for LMAPF (SILLM), inherits the fast reasoning speed of learning-based methods and the high solution quality of search-based methods with the help of modern GPUs. Across six large-scale maps with up to 10,000 agents and varying obstacle structures, SILLM surpasses the best learning- and search-based baselines, achieving average throughput improvements of 137.7% and 16.0%, respectively. Furthermore, SILLM also beats the winning solution of the 2023 League of Robot Runners, an international LMAPF competition sponsored by Amazon Robotics. Finally, we validated SILLM with 10 real robots and 100 virtual robots in a mockup warehouse environment.

What problem does this paper attempt to address?

The paper addresses the problem of efficiently coordinating a large number of robots in dynamic environments in Lifelong Multi-Agent Path Finding (LMAPF), with the goal of maximizing throughput (the number of robots reaching their goals per time step). Specifically, it proposes an imitation-learning-based solver called SILLM (Scalable Imitation Learning for LMAPF) to overcome the performance bottlenecks of existing methods on large-scale instances, especially scenarios involving thousands of robots (up to 10,000 in the experiments).

### Main Challenges:

1. **Scalability for Large-Scale Instances**: Existing learning-based methods perform well on small instances but degrade significantly on large ones.
2. **Real-Time and High Throughput**: In dynamic environments, robots must replan their paths frequently, so the solver needs both fast inference and high-quality solutions.
3. **Global Guidance and Local Collision Avoidance**: Effective global guidance and local collision avoidance are crucial for avoiding congestion and improving throughput in large-scale multi-agent systems.

### Solution:

- **Imitation Learning**: SILLM learns to produce high-quality plans by imitating a high-performance search algorithm (Windowed MAPF-LNS).
- **Space-Sensitive Communication Module (SSC)**: A new communication architecture that emphasizes spatial information so the policy can better learn efficient cooperative behavior among agents.
- **Global Guidance Techniques**: Three types of global guidance (Backward Dijkstra, Static Guidance, Dynamic Guidance) are combined to adapt to different map structures.
- **Single-Step Collision Resolution**: An improved single-step collision resolution method (Learnable PIBT) is adopted to ensure collision-free moves.
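To make two of the ingredients above concrete, here is a minimal sketch of how a backward-Dijkstra heuristic and a PIBT-style single-step collision resolver can fit together. It is not the paper's implementation: it uses the classic PIBT rule with a distance-based preference order rather than the learnable variant, and the names `backward_distances`, `pibt_step`, the grid encoding (0 = free, 1 = obstacle), and the distance-based priority are all assumptions made for illustration.

```python
from collections import deque

# Candidate moves: wait, up, down, left, right (4-connected grid).
MOVES = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]


def backward_distances(grid, goal):
    """Distance from every reachable free cell to `goal`.

    With unit edge costs, a breadth-first search from the goal gives the
    same result as a backward Dijkstra search.
    """
    rows, cols = len(grid), len(grid[0])
    dist = {goal: 0}
    queue = deque([goal])
    while queue:
        r, c = queue.popleft()
        for dr, dc in MOVES[1:]:
            cell = (r + dr, c + dc)
            inside = 0 <= cell[0] < rows and 0 <= cell[1] < cols
            if inside and grid[cell[0]][cell[1]] == 0 and cell not in dist:
                dist[cell] = dist[(r, c)] + 1
                queue.append(cell)
    return dist


def pibt_step(positions, goals, grid):
    """Return one collision-free joint move as a dict agent -> next cell.

    Assumes every agent can reach its goal from its current cell.
    """
    dist = {a: backward_distances(grid, goals[a]) for a in positions}
    occupied_now = {cell: a for a, cell in positions.items()}
    reserved = {}   # next cell -> agent that claimed it
    next_cell = {}  # agent -> chosen next cell

    def plan(agent, pusher):
        here = positions[agent]
        cands = [(here[0] + dr, here[1] + dc) for dr, dc in MOVES]
        cands = [v for v in cands if v in dist[agent]]
        cands.sort(key=lambda v: dist[agent][v])  # prefer cells nearer the goal
        for v in cands:
            if v in reserved:
                continue  # would cause a vertex collision
            if pusher is not None and v == positions[pusher]:
                continue  # would swap places with the agent pushing us
            reserved[v] = agent
            next_cell[agent] = v
            other = occupied_now.get(v)
            if other is not None and other != agent and other not in next_cell:
                # Priority inheritance: the occupant must move out first.
                if not plan(other, agent):
                    continue  # occupant is stuck and keeps v; try another cell
            return True
        # No valid move: stay put and report failure to the pusher.
        reserved[here] = agent
        next_cell[agent] = here
        return False

    # Hypothetical priority rule: agents farther from their goal go first.
    for agent in sorted(positions, key=lambda a: -dist[a][positions[a]]):
        if agent not in next_cell:
            plan(agent, None)
    return next_cell


# Two agents meet head-on in a corridor with one side pocket at (1, 1).
grid = [[0, 0, 0],
        [1, 0, 1]]
goals = {"a": (0, 2), "b": (0, 0)}
step1 = pibt_step({"a": (0, 0), "b": (0, 2)}, goals, grid)
step2 = pibt_step(step1, goals, grid)
```

In this toy run, agent `a` first advances to the middle cell while `b` waits; in the second step `b` has the higher priority, so `a` is pushed into the side pocket and `b` takes the middle cell. A learnable variant would replace the distance-based sort with an order derived from a policy's action preferences.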
### Experimental Results:

- **Performance Improvement**: Across six large maps, SILLM improves average throughput by 137.7% over the best learning-based baseline and by 16.0% over the best search-based baseline.
- **Real-Time**: SILLM can plan for 10,000 robots in less than 1 second per time step.
- **Practical Application Verification**: SILLM was validated in a mockup warehouse environment with 10 real robots and 100 virtual robots, demonstrating its potential for practical deployment.

In summary, SILLM addresses the key challenges of large-scale lifelong multi-agent path finding by combining the fast inference of learning-based methods with the solution quality of search-based methods.
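As a reading aid for the percentages above, the following sketch shows how throughput and a relative improvement can be computed. It assumes the reported figures are relative gains over the baseline's throughput; the function names and the task counts in the example are hypothetical and are not taken from the paper.

```python
def throughput(goals_reached, timesteps):
    """Average number of goals reached per time step."""
    return goals_reached / timesteps


def relative_improvement(new, baseline):
    """Percentage gain of `new` over `baseline`."""
    return 100.0 * (new - baseline) / baseline


def ratio_from_improvement(percent):
    """Convert a percentage gain into a multiplicative factor."""
    return 1.0 + percent / 100.0


# Hypothetical counts, for illustration only.
ours = throughput(goals_reached=5000, timesteps=1000)      # 5.0 goals per step
baseline = throughput(goals_reached=4000, timesteps=1000)  # 4.0 goals per step
gain = relative_improvement(ours, baseline)                # 25.0 percent
```

Under this reading, a 137.7% improvement corresponds to roughly 2.4 times the baseline throughput, and a 16.0% improvement to 1.16 times.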

Deploying Ten Thousand Robots: Scalable Imitation Learning for Lifelong Multi-Agent Path Finding

Learning to Cooperate: Application of Deep Reinforcement Learning for Online AGV Path Finding

Scaling Lifelong Multi-Agent Path Finding to More Realistic Settings: Research Challenges and Opportunities

Moving Forward in Formation: A Decentralized Hierarchical Learning Approach to Multi-Agent Moving Together

SCRIMP: Scalable Communication for Reinforcement- and Imitation-Learning-Based Multi-Agent Pathfinding

Work Smarter Not Harder: Simple Imitation Learning with CS-PIBT Outperforms Large Scale Imitation Learning for MAPF

Learn to Follow: Decentralized Lifelong Multi-agent Pathfinding via Planning and Learning

LNS2+RL: Combining Multi-agent Reinforcement Learning with Large Neighborhood Search in Multi-agent Path Finding

Lifelong Multi-Agent Path Finding in Large-Scale Warehouses

Multi-Robot Informative Path Planning for Efficient Target Mapping using Deep Reinforcement Learning

Crowd Perception Communication-Based Multi-Agent Path Finding with Imitation Learning

Improving Learnt Local MAPF Policies with Heuristic Search

Anytime Multi-Agent Path Finding via Machine Learning-Guided Large Neighborhood Search

A Local Information Aggregation based Multi-Agent Reinforcement Learning for Robot Swarm Dynamic Task Allocation

CL-MAPF: Multi-Agent Path Finding for Car-Like robots with kinematic and spatiotemporal constraints

LLM A*: Human in the Loop Large Language Models Enabled A* Search for Robotics

MAP3F: a decentralized approach to multi-agent pathfinding and collision avoidance with scalable 1D, 2D, and 3D feature fusion

HiMAP: Learning Heuristics-Informed Policies for Large-Scale Multi-Agent Pathfinding

Attention-based Priority Learning for Limited Time Multi-Agent Path Finding

Caching-Augmented Lifelong Multi-Agent Path Finding

Curriculum Learning Based Multi-Agent Path Finding for Complex Environments