Abstract: Designing a safe and effective collision avoidance policy for multiple robots is essential in decentralized scenarios, where each robot generates its own path and must do so safely. Recently, reinforcement learning has been used with positive results to develop decentralized policies that enable multiple robots to move cooperatively and accomplish tasks. However, unsafe actions taken while exploring during reinforcement learning training result in inadequate safety. We seek to enhance the safety of distributed multi-robot navigation policies and propose a new imitation learning framework based on the variational Bayesian model, which enables robots to learn safe actions by anticipating the subsequent state they are expected to reach. In addition, we propose a new policy neural network structure for multi-robot navigation that introduces the transformer structure to encode the significance of nearby robots in relation to their forthcoming conditions. Experiments demonstrated that our policy guides robots more safely through multi-robot environments under conditions of limited information, outperforming the state-of-the-art RL-RVO method in terms of success rate.

Note to Practitioners: The motivation of this paper is to address the problem of collision avoidance in a multi-robot environment under limited information; the approach can also be applied to autonomous driving, crowd simulation, and other related fields. Reinforcement learning has produced positive outcomes in creating decentralized policies that enable multiple robots to move cooperatively and complete tasks. However, ensuring adequate safety remains challenging because hazardous actions may be explored during training. This article aims to enhance the safety of distributed policies that guide robots through navigation tasks in dynamic multi-robot environments. First, we introduce a novel imitation learning framework based on the variational Bayesian model. This framework lets the policy learn safe actions, which improves its performance and guides the robot to navigate and avoid obstacles more safely. We propose a loss function that enables the policy to anticipate the future state the robot is expected to reach. By incorporating the transformer structure, we design a new neural network structure for multi-robot navigation that encodes the significance of nearby robots with respect to their upcoming conditions. This network structure employs BiGRUs so that the policy can process observations from multiple agents. Compared to existing works such as GA3C-CADRL, SARL, and RL-RVO, our proposed method achieves a higher success rate. In future research, we will investigate ways to improve travel time and average speed while still strictly ensuring safe navigation. Furthermore, we plan to extend this approach to navigation in more densely populated multi-robot environments.
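To make the idea of a loss that anticipates the next state more concrete, the following is a minimal sketch of a generic variational imitation objective. It is not the loss from the paper, which the abstract does not specify. The function names, the mean-squared-error terms, the standard-normal prior, and the weight `beta` are all assumptions chosen for illustration.

```python
import math


def gaussian_kl(mu, log_var):
    """Closed-form KL( N(mu, diag(exp(log_var))) || N(0, I) )."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv
                     for m, lv in zip(mu, log_var))


def mse(pred, target):
    """Mean squared error between two equal-length vectors."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def imitation_loss(pred_action, expert_action,
                   pred_next_state, next_state,
                   mu, log_var, beta=0.1):
    """Hypothetical objective (an assumption, not the paper's formula):
    imitate the expert action, predict the state the robot is expected
    to reach next, and regularize the latent code toward N(0, I)."""
    action_term = mse(pred_action, expert_action)   # imitation term
    state_term = mse(pred_next_state, next_state)   # next-state anticipation
    kl_term = gaussian_kl(mu, log_var)              # variational regularizer
    return action_term + state_term + beta * kl_term


if __name__ == "__main__":
    loss = imitation_loss(
        pred_action=[1.0, 0.0], expert_action=[0.0, 0.0],
        pred_next_state=[0.5, 0.5], next_state=[0.5, 0.5],
        mu=[1.0], log_var=[0.0],
    )
    print(loss)
```

In this sketch the next-state term penalizes a policy whose internal prediction of where the robot will end up disagrees with the demonstrated trajectory, which is one plausible reading of "anticipating the subsequent state". The relative weighting of the three terms is a free design choice here.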

Transformer-Based Imitative Reinforcement Learning for Multirobot Path Planning

Multi-Agent Path Finding Using Imitation-Reinforcement Learning with Transformer

Mapless Collaborative Navigation for a Multi-Robot System Based on the Deep Reinforcement Learning

Leveraging the Efficiency of Multi-Task Robot Manipulation Via Task-Evoked Planner and Reinforcement Learning

Transformer-Based Reinforcement Learning for Multi-Robot Autonomous Exploration

Toward Safe Distributed Multi-Robot Navigation Coupled with Variational Bayesian Model

Deep Reinforcement Learning with Multi-Critic TD3 for Decentralized Multi-Robot Path Planning

Mapless Path Planning of Multi-robot Systems in Complex Environments Via Deep Reinforcement Learning

Multi-Robot Informative Path Planning for Efficient Target Mapping using Deep Reinforcement Learning

An Enhanced Hierarchical Planning Framework for Multi-Robot Autonomous Exploration

Multi-agent policy learning-based path planning for autonomous mobile robots

A Method to Plan the Path of a Robot Utilizing Deep Reinforcement Learning and Multi-Sensory Information Fusion

Multi-objective Path Planning Based on Deep Reinforcement Learning

Multi-Robot Coverage Path Planning Based on Deep Reinforcement Learning

Multi-robot Social-aware Cooperative Planning in Pedestrian Environments Using Multi-agent Reinforcement Learning

Robot path planning in dynamic environment based on reinforcement learning

A Decentralized Multi-Agent Path Planning Approach Based on Imitation Learning and Global Static Feature Extraction

Spatio-Temporal Transformer-Based Reinforcement Learning for Robot Crowd Navigation

Transformer Based Multi-Agent Framework

Hierarchical Large Scale Multirobot Path (Re)Planning

Multi-Robot Path Planning Method Using Reinforcement Learning