Abstract: The core of multirobot collision avoidance lies in developing a decentralized policy that guides robots from their initial positions to target locations, based on the environment states they perceive, while ensuring collision avoidance. However, current multirobot collision avoidance policy networks struggle to simultaneously extract the global spatial state, the temporal state, and the reciprocity among robots, which limits their performance. In this work, we develop a novel reciprocal velocity obstacle (RVO) spatial-temporal network and employ the proximal policy optimization algorithm to train the network parameters during interactions with a multirobot simulation environment. Specifically, we design a temporal state encoder module that represents the temporal characteristics of observation sequence data by combining the graph attention mechanism with a transformer encoding module. Furthermore, we design a reciprocal spatial state encoder module that uses a transformer encoding module to merge feature data from long short-term memory (LSTM), gated recurrent unit (GRU), and bidirectional gated recurrent unit (BiGRU) branches in order to represent spatial characteristics in RVO sequence data. Extensive simulation experiments demonstrate that our proposed method outperforms reinforcement learning (RL)-RVO, the state-of-the-art distributed policy. We further conducted physical experiments using three Crazyflie quadcopter drones, demonstrating the method's ability to effectively guide agents’ movements and avoid collisions.
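For readers unfamiliar with the RVO concept the abstract builds on, the following is a minimal sketch of the standard geometric reciprocal velocity obstacle test for two disc-shaped robots in 2D. It is not the network proposed in this work; the function names (`ray_hits_disc`, `in_rvo`), the disc robot model, and the unbounded time horizon are assumptions made for illustration. A candidate velocity lies in the RVO when the relative velocity `2 * candidate - vel_a - vel_b` leads to a collision.

```python
def ray_hits_disc(origin, direction, center, radius):
    """Return True if origin + t * direction enters the disc for some t > 0."""
    ox, oy = center[0] - origin[0], center[1] - origin[1]
    if ox * ox + oy * oy <= radius * radius:
        return True  # the robots already overlap
    dx, dy = direction
    speed_sq = dx * dx + dy * dy
    if speed_sq == 0.0:
        return False  # no relative motion, so the gap never closes
    # Time of closest approach along the ray.
    t = (ox * dx + oy * dy) / speed_sq
    if t <= 0.0:
        return False  # moving away from the disc
    cx, cy = ox - t * dx, oy - t * dy
    return cx * cx + cy * cy < radius * radius


def in_rvo(candidate, pos_a, vel_a, rad_a, pos_b, vel_b, rad_b):
    """Check whether `candidate` (a new velocity for robot A) lies in the
    reciprocal velocity obstacle induced by robot B.

    Hypothetical helper: assumes disc-shaped robots and no time horizon.
    """
    rel = (2.0 * candidate[0] - vel_a[0] - vel_b[0],
           2.0 * candidate[1] - vel_a[1] - vel_b[1])
    return ray_hits_disc(pos_a, rel, pos_b, rad_a + rad_b)


# Two robots approaching head-on along the x-axis.
pos_a, vel_a, rad_a = (0.0, 0.0), (1.0, 0.0), 0.5
pos_b, vel_b, rad_b = (4.0, 0.0), (-1.0, 0.0), 0.5

keeps_course = in_rvo((1.0, 0.0), pos_a, vel_a, rad_a, pos_b, vel_b, rad_b)
sidesteps = in_rvo((0.0, 1.0), pos_a, vel_a, rad_a, pos_b, vel_b, rad_b)
```

In this head-on setup, keeping the current velocity falls inside the RVO, while the sideways candidate falls outside it. A learned policy such as the one described above would take RVO-derived quantities as part of its input rather than apply this test directly.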

Sample Efficient Learning of Path Following and Obstacle Avoidance Behavior for Quadrotors

3D Path Planning of Quadrotor Aerial Robots Using Numerical Optimization

Efficient Learning-based Trajectory Tracker for Quadrotor at High-speed Flight

Nearest-Neighbor-based Collision Avoidance for Quadrotors via Reinforcement Learning

Deep Learning Quadcopter Control Via Risk-Aware Active Learning

Path Planning and Following Control of a Quadrotor Helicopter in Three-Dimensional Space With Limited Information

MPCC-based Path Following Control for a Quadrotor with Collision Avoidance Guaranteed in Constrained Environments

Imitation Learning-Based Online Time-Optimal Control with Multiple-Waypoint Constraints for Quadrotors

Computationally Efficient Trajectory Planning for High Speed Obstacle Avoidance of a Quadrotor with Active Sensing

Global Path Planning of Quadrotor Using Reinforcement Learning

A Lightweight Control Method for Fast and Agile Quadrotor Using NMPC-Imitation Learning

Deterministic Policy Gradient with Integral Compensator for Robust Quadrotor Control

Model-Predictive Control with Stochastic Collision Avoidance Using Bayesian Policy Optimization

Quadrotor Trajectory Planning for Visibility-Aware Target Following

Guidance & Control Networks for Time-Optimal Quadcopter Flight

Deep Reinforcement Learning-based Quadcopter Controller: A Practical Approach and Experiments

Learning a Single Near-hover Position Controller for Vastly Different Quadcopters

Model Predictive Path Following Control of a Quadrotor in Constrained Environments

Aggressive and robust low-level control and trajectory tracking for quadrotors with deep reinforcement learning

Reciprocal Velocity Obstacle Spatial-Temporal Network for Distributed Multirobot Navigation

Nonlinear Model Predictive Control-Based Guidance Algorithm for Quadrotor Trajectory Tracking with Obstacle Avoidance