Abstract: Multi-Agent Path Finding (MAPF), the problem of planning collision-free paths for a team of robots in a known environment, is a critical component of logistics and warehouse management. Recent work introduced a novel MAPF approach, LNS2, which repairs a quickly obtained set of infeasible paths through iterative re-planning, relying on a fast yet lower-quality priority-based planner. At the same time, there has been a recent push for Multi-Agent Reinforcement Learning (MARL) based MAPF algorithms, which let agents learn decentralized policies that cooperate better than such priority planning, although they inevitably remain slower. In this paper, we introduce a new MAPF algorithm, LNS2+RL, which combines the distinct yet complementary characteristics of LNS2 and MARL to balance their individual limitations and get the best of both worlds. During early iterations, LNS2+RL relies on MARL for low-level re-planning, which we show eliminates collisions far more effectively than a priority-based planner. There, our MARL-based planner allows agents to reason about past and future/predicted information and to gradually learn cooperative decision-making through a carefully designed learning curriculum. At later stages of planning, LNS2+RL adaptively switches to priority-based planning to quickly resolve the remaining collisions, naturally trading off solution quality and computational efficiency. Our comprehensive experiments on challenging tasks across various team sizes, world sizes, and map structures consistently demonstrate the superior performance of LNS2+RL compared to many MAPF algorithms, including LNS2, LaCAM, and EECBS, and it performs significantly better in complex scenarios. Finally, we experimentally validate our algorithm in a hybrid simulation of a warehouse mockup involving a team of 100 (real-world and simulated) robots.
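The two-phase repair loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `count_collisions` and `repair`, the random choice of which agents to re-plan, the rule of accepting only strict improvements, and the stall-based switch (`switch_after`) are all assumptions made for illustration. The abstract does not state how the switch is decided, and both low-level planners are passed in as plain callables standing in for the MARL policy and the priority-based planner.

```python
import random
from typing import Callable, Dict, List, Sequence, Tuple

Cell = Tuple[int, int]
Path = List[Cell]
Paths = Dict[int, Path]
# Hypothetical planner interface: (current paths, agents to re-plan) -> new paths.
Replanner = Callable[[Paths, Sequence[int]], Paths]


def pos_at(path: Path, t: int) -> Cell:
    """Position at time t; an agent waits at its last cell once its path ends."""
    return path[min(t, len(path) - 1)]


def count_collisions(paths: Paths) -> int:
    """Count vertex and swap collisions over all pairs of agents."""
    agents = sorted(paths)
    horizon = max(len(p) for p in paths.values())
    total = 0
    for i, a in enumerate(agents):
        for b in agents[i + 1:]:
            for t in range(horizon):
                same_cell = pos_at(paths[a], t) == pos_at(paths[b], t)
                swapped = (
                    t > 0
                    and pos_at(paths[a], t) == pos_at(paths[b], t - 1)
                    and pos_at(paths[a], t - 1) == pos_at(paths[b], t)
                )
                if same_cell or swapped:
                    total += 1
    return total


def repair(
    paths: Paths,
    learned_replan: Replanner,
    priority_replan: Replanner,
    switch_after: int = 3,        # assumed rule: switch after this many stalled iterations
    neighborhood_size: int = 2,   # assumed: number of agents re-planned per iteration
    max_iters: int = 100,
    seed: int = 0,
) -> Tuple[Paths, int]:
    """Iteratively re-plan subsets of agents until no collisions remain."""
    rng = random.Random(seed)
    best = dict(paths)
    best_cost = count_collisions(best)
    stalled = 0
    use_learned = True  # early iterations use the learned (MARL-like) planner
    for _ in range(max_iters):
        if best_cost == 0:
            break
        subset = rng.sample(sorted(best), min(neighborhood_size, len(best)))
        planner = learned_replan if use_learned else priority_replan
        candidate = planner(best, subset)
        cost = count_collisions(candidate)
        if cost < best_cost:
            # Simplification: keep a candidate only if it strictly reduces collisions.
            best, best_cost, stalled = candidate, cost, 0
        else:
            stalled += 1
            if use_learned and stalled >= switch_after:
                use_learned = False  # later iterations fall back to priority planning
    return best, best_cost
```

In this sketch the switch is one-way and triggered purely by lack of progress; any other criterion (remaining collision count, elapsed time) could be substituted without changing the loop structure.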

Attention-Cooperated Reinforcement Learning for Multi-agent Path Planning

Learning to Cooperate: Application of Deep Reinforcement Learning for Online AGV Path Finding

Learning Attention-Based Strategies to Cooperate for Multi-Agent Path Finding

Moving Forward in Formation: A Decentralized Hierarchical Learning Approach to Multi-Agent Moving Together

MAPPO method based on attention behavior network

Multi-Robot Path Planning Combining Heuristics and Multi-Agent Reinforcement Learning

Multi-robot Social-aware Cooperative Planning in Pedestrian Environments Using Multi-agent Reinforcement Learning

Multiagent Path Finding Using Deep Reinforcement Learning Coupled With Hot Supervision Contrastive Loss

Multi-Agent Path Finding Method Based on Evolutionary Reinforcement Learning

Multi-robot social-aware cooperative planning in pedestrian environments using attention-based actor-critic

Learning Efficient Multi-Agent Cooperative Visual Exploration

Cooperative Reward Shaping for Multi-Agent Pathfinding

Attention-based Priority Learning for Limited Time Multi-Agent Path Finding

Learn to Follow: Decentralized Lifelong Multi-agent Pathfinding via Planning and Learning

A Decentralized Multi-Agent Path Planning Approach Based on Imitation Learning and Selective Communication

Multi-agent collaborative path planning algorithm with reinforcement learning and combined prioritized experience replay in Internet of Things

Multi-Robot Informative Path Planning for Efficient Target Mapping using Deep Reinforcement Learning

MAP3F: a decentralized approach to multi-agent pathfinding and collision avoidance with scalable 1D, 2D, and 3D feature fusion

LNS2+RL: Combining Multi-agent Reinforcement Learning with Large Neighborhood Search in Multi-agent Path Finding

Intent-based Deep Reinforcement Learning for Multi-agent Informative Path Planning

Multi-agent policy learning-based path planning for autonomous mobile robots