Abstract: Designing a safe and effective collision avoidance policy is essential in decentralized multi-robot scenarios, where each robot is responsible for generating its own path. Recently, reinforcement learning has yielded positive results in developing decentralized policies that enable multiple robots to move cooperatively and accomplish tasks. However, because unsafe actions may be explored during reinforcement learning training, the resulting policies offer inadequate safety. We seek to enhance the safety of distributed multi-robot navigation policies and propose a new imitation learning framework based on the variational Bayesian model, which enables robots to learn safe actions by anticipating the subsequent state they are expected to reach. In addition, we propose a new policy neural network structure for multi-robot navigation that introduces the transformer structure to encode the significance of nearby robots in relation to their forthcoming conditions. Experiments demonstrate that our policy guides robots more safely through multi-robot environments under limited information, outperforming the state-of-the-art RL-RVO method in terms of success rate.

Note to Practitioners: The motivation of this paper is to address the problem of collision avoidance in a multi-robot environment under limited information; the approach can also be applied to autonomous driving, crowd simulation, and other related fields. Reinforcement learning has produced positive results in creating decentralized policies that enable multiple robots to move cooperatively and complete tasks. However, safety remains inadequate because hazardous actions may be explored during training. This article aims to enhance the safety of distributed policies that guide robots through navigation tasks in dynamic multi-robot environments.
To begin with, we introduce a novel imitation learning framework based on the variational Bayesian model. This framework lets the policy learn safe actions, which improves its performance and guides the robot to navigate and avoid obstacles more safely. We propose a loss function that enables the policy to anticipate the future state the robot is expected to reach. By incorporating the transformer structure, we design a new neural network structure for multi-robot navigation that encodes the significance of nearby robots concerning their upcoming conditions. This network structure employs bidirectional GRUs (BiGRUs) so that the policy can take in observations from multiple agents. Compared with existing works such as GA3C-CADRL, SARL, and RL-RVO, our proposed method achieves a higher success rate. In future research, we will investigate methods to improve travel time and average speed while still strictly ensuring safe navigation. Furthermore, we plan to extend this approach to navigation in more densely populated multi-robot environments.
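
As a rough illustration of what "encoding the significance of nearby robots" can mean in a transformer-style network, the sketch below applies plain scaled dot-product attention from one ego robot to its neighbors. It is not the architecture described above: the function names, the 2-D feature vectors, and the use of raw features as queries, keys, and values are all assumptions made for illustration, whereas a trained policy would obtain them from learned projections of each robot's observation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend_to_neighbors(ego_query, neighbor_keys, neighbor_values):
    """Scaled dot-product attention of one ego query over its neighbors.

    Returns (weights, pooled): one weight per neighbor, summing to 1, and
    the weighted sum of the neighbor value vectors.
    """
    if not neighbor_keys:
        raise ValueError("at least one neighbor is required")
    scale = math.sqrt(len(ego_query))
    scores = [
        sum(q * k for q, k in zip(ego_query, key)) / scale
        for key in neighbor_keys
    ]
    weights = softmax(scores)
    dim = len(neighbor_values[0])
    pooled = [
        sum(w * value[d] for w, value in zip(weights, neighbor_values))
        for d in range(dim)
    ]
    return weights, pooled


# Hypothetical toy features (assumption): the first neighbor is most aligned
# with the ego query, the third is least aligned.
ego_query = [1.0, 0.0]
neighbor_keys = [[2.0, 0.0], [0.0, 2.0], [-2.0, 0.0]]
neighbor_values = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]

weights, pooled = attend_to_neighbors(ego_query, neighbor_keys, neighbor_values)
print("attention weights:", weights)
print("pooled neighbor feature:", pooled)
```

The attention weights play the role of a per-neighbor significance score, and the pooled vector is a fixed-size summary that a policy head could consume regardless of how many neighbors are observed.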

Attention-based Value Classification Reinforcement Learning for Collision-free Robot Navigation

Learning Observation-Based Certifiable Safe Policy for Decentralized Multi-Robot Navigation

A safe reinforcement learning approach for autonomous navigation of mobile robots in dynamic environments

Adaptive Environment Modeling Based Reinforcement Learning for Collision Avoidance in Complex Scenes

SSRL: A Safe and Smooth Reinforcement Learning Approach for Collision Avoidance in Navigation

Collision-Free Robot Navigation in Crowded Environments using Learning based Convex Model Predictive Control

Real-Time Navigation In Dynamic Human Environments Using Optimal Reciprocal Collision Avoidance

A Soft Actor-Critic Deep Reinforcement-Learning-Based Robot Navigation Method Using LiDAR

Training Is Execution: A Reinforcement Learning-Based Collision Avoidance Algorithm for Volatile Scenarios

Reinforcement Learned Distributed Multi-Robot Navigation With Reciprocal Velocity Obstacle Shaped Rewards

Robot Navigation with Entity-Based Collision Avoidance using Deep Reinforcement Learning

Multigoal Visual Navigation With Collision Avoidance via Deep Reinforcement Learning

Toward Safe Distributed Multi-Robot Navigation Coupled with Variational Bayesian Model

A Deep Safe Reinforcement Learning Approach for Mapless Navigation

Collision Anticipation via Deep Reinforcement Learning for Visual Navigation

Efficient Multi-agent Navigation with Lightweight DRL Policy

Autonomous navigation of mobile robots in unknown environments using off-policy reinforcement learning with curriculum learning

Distributed Multi-Robot Collision Avoidance Via Deep Reinforcement Learning for Navigation in Complex Scenarios

Deep-Reinforcement-Learning-Based Collision Avoidance of Autonomous Driving System for Vulnerable Road User Safety

Fully Distributed Multi-Robot Collision Avoidance via Deep Reinforcement Learning for Safe and Efficient Navigation in Complex Scenarios

The Multi-Dimensional Actions Control Approach for Obstacle Avoidance Based on Reinforcement Learning