Learning to Coordinate for a Worker-Station Multi-robot System in Planar Coverage Tasks

Jingtao Tang, Yuan Gao, Tin Lun Lam
DOI: https://doi.org/10.1109/LRA.2022.3214446
2022-08-24
Abstract: For massive large-scale tasks, a multi-robot system (MRS) can effectively improve efficiency by utilizing each robot's different capabilities, mobility, and functionality. In this paper, we focus on the multi-robot coverage path planning (mCPP) problem in large-scale planar areas with random dynamic interferers in the environment, where the robots have limited resources. We introduce a worker-station MRS consisting of multiple workers with limited resources for actual work, and one station with enough resources for resource replenishment. We aim to solve the mCPP problem for the worker-station MRS by formulating it as a fully cooperative multi-agent reinforcement learning problem. We then propose an end-to-end decentralized online planning method, which simultaneously solves coverage planning for the workers and rendezvous planning for the station. Our method reduces the influence of random dynamic interferers on planning, while the robots avoid collisions with them. We conduct simulation and real-robot experiments, and the comparison results show that our method has competitive performance in solving the mCPP problem for the worker-station MRS in terms of task finish time.
Robotics, Artificial Intelligence, Multiagent Systems
What problem does this paper attempt to address?
This paper addresses the multi-robot coverage path planning (mCPP) problem for a multi-robot system (MRS) operating in a large-scale planar area with random dynamic obstacles, where each robot has limited resources. Specifically, it studies a worker-station MRS composed of multiple worker robots that perform the actual work and one station robot that carries enough resources to replenish them.

The paper models the mCPP problem as a fully cooperative multi-agent reinforcement learning (MARL) problem and proposes an end-to-end decentralized online planning method that simultaneously solves coverage planning for the workers and rendezvous planning for the station. The method also aims to reduce the impact of random dynamic obstacles on planning while letting the robots avoid collisions with them.

The main contributions of the paper include:
1. Proposing an end-to-end decentralized online planning method for the worker-station MRS in the mCPP problem. The method reduces the impact of random dynamic obstacles on planning while ensuring that the robots avoid them.
2. Designing a two-stage curriculum learning method, combined with an intrinsic curiosity module and a soft approximation of the worker energy constraints, which guides training on large-scale coverage tasks.
3. Providing ablation-study, simulation, and real-robot experiment results showing that the proposed method outperforms decomposition-based and graph-based baseline methods in coverage completion time.

By applying deep reinforcement learning (DRL), in particular the centralized training and decentralized execution (CTDE) paradigm, the paper addresses limitations of traditional methods in handling dynamic obstacles and multi-robot coordination. This improves task completion efficiency and also makes the system more robust and adaptable.
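The "soft approximation of worker energy constraints" mentioned above can be illustrated with a small reward-shaping sketch. The summary does not give the paper's actual formulation, so everything here is an assumption made for illustration: the softplus form, the function names `soft_energy_penalty` and `shaped_reward`, and the parameters `cost_per_step`, `sharpness`, and `weight` are all hypothetical. The idea shown is only that a hard rule ("a worker must never run out of energy before reaching the station") can be replaced by a smooth penalty that is close to zero while the worker has slack and grows roughly linearly once it does not.

```python
import math


def soft_energy_penalty(energy, dist_to_station,
                        cost_per_step=1.0, sharpness=2.0, weight=1.0):
    """Smooth (softplus) penalty on the energy a worker lacks to reach the station.

    Hypothetical illustration, not the paper's formulation.
    Returns a non-positive number: about 0 with ample slack, about -deficit
    when the worker is short of energy.
    """
    # Energy the worker is missing to reach the station (negative means slack).
    deficit = dist_to_station * cost_per_step - energy
    z = sharpness * deficit
    # Numerically stable softplus: log(1 + exp(z)).
    if z > 0:
        softplus = z + math.log1p(math.exp(-z))
    else:
        softplus = math.log1p(math.exp(z))
    return -weight * softplus / sharpness


def shaped_reward(newly_covered, energy, dist_to_station):
    """Coverage reward (1 for a newly covered cell) plus the soft energy penalty."""
    return float(newly_covered) + soft_energy_penalty(energy, dist_to_station)


if __name__ == "__main__":
    # Plenty of slack: the penalty is negligible.
    print(shaped_reward(True, energy=20.0, dist_to_station=2.0))
    # Short of energy: the penalty dominates the coverage reward.
    print(shaped_reward(True, energy=1.0, dist_to_station=5.0))
```

A smooth penalty of this kind gives the learner a gradient of feedback before the constraint is violated, which is one plausible reason to prefer it over a hard termination signal during training; whether the paper uses this rationale is not stated in the summary.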