Reciprocal Velocity Obstacle Spatial-Temporal Network for Distributed Multirobot Navigation
Lin Chen, Yaonan Wang, Zhiqiang Miao, Mingtao Feng, Zhen Zhou, Hesheng Wang, Danwei Wang
DOI: https://doi.org/10.1109/tie.2024.3379630
IF: 7.7
2024-01-01
IEEE Transactions on Industrial Electronics
Abstract: The core of multirobot collision avoidance lies in developing a decentralized policy that guides robots from their initial positions to their target locations, based on the environment states they perceive, while avoiding collisions. However, existing multirobot collision avoidance policy networks struggle to simultaneously extract the global spatial state, the temporal state, and the reciprocity among robots, which limits their performance. In this work, we develop a novel reciprocal velocity obstacle (RVO) spatial-temporal network and train its parameters with the proximal policy optimization algorithm through interaction with a multirobot simulation environment. Specifically, we design a temporal state encoder module that represents the temporal characteristics of observation sequence data by combining a graph attention mechanism with a transformer encoding module. Furthermore, we design a reciprocal spatial state encoder module that represents the spatial characteristics of RVO sequence data, using a transformer encoding module to merge features from long short-term memory (LSTM), gated recurrent unit (GRU), and bidirectional gated recurrent unit (BiGRU) branches. Extensive simulation experiments demonstrate that the proposed method outperforms the state-of-the-art distributed policy reinforcement learning (RL)-RVO. We further conducted physical experiments with three Crazyflie quadcopter drones, illustrating the method’s ability to effectively guide agents’ movements and avoid collisions.
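For readers unfamiliar with the RVO concept that the network's input is built on, the following is a minimal sketch of the classic geometric RVO membership test for two disc-shaped robots. It is not the paper's network or its RVO feature encoding. The function names, the 2D tuple representation, the infinite time horizon, and the example values are all assumptions made for illustration.

```python
def ray_hits_disc(origin, direction, center, radius):
    """True if the ray origin + t*direction (t >= 0) touches the disc."""
    cx, cy = center[0] - origin[0], center[1] - origin[1]
    if cx * cx + cy * cy <= radius * radius:
        return True  # origin is already inside the disc
    dx, dy = direction
    norm_sq = dx * dx + dy * dy
    if norm_sq == 0.0:
        return False  # no relative motion, so the gap never closes
    # Parameter of the point on the line closest to the disc center.
    t = (cx * dx + cy * dy) / norm_sq
    if t <= 0.0:
        return False  # the disc lies behind the ray
    px, py = cx - t * dx, cy - t * dy
    return px * px + py * py <= radius * radius


def in_vo(p_a, p_b, v_a, v_b, r_sum):
    """Velocity obstacle test: does v_a collide with B if B keeps v_b?

    r_sum is the sum of the two robot radii (hypothetical parameter name).
    """
    rel = (v_a[0] - v_b[0], v_a[1] - v_b[1])
    return ray_hits_disc(p_a, rel, p_b, r_sum)


def in_rvo(p_a, p_b, v_new, v_a, v_b, r_sum):
    """Reciprocal velocity obstacle test for a candidate velocity v_new.

    v_new is in the RVO exactly when 2*v_new - v_a is in the plain VO,
    which assumes each robot takes half of the avoidance effort.
    """
    mirrored = (2.0 * v_new[0] - v_a[0], 2.0 * v_new[1] - v_a[1])
    return in_vo(p_a, p_b, mirrored, v_b, r_sum)


# Example: two robots approaching head-on along the x axis.
p_a, p_b = (0.0, 0.0), (4.0, 0.0)
v_a, v_b = (1.0, 0.0), (-1.0, 0.0)
keeps_course = in_rvo(p_a, p_b, (1.0, 0.0), v_a, v_b, r_sum=1.0)
veers_left = in_rvo(p_a, p_b, (0.5, 0.5), v_a, v_b, r_sum=1.0)
```

In this example, keeping the current velocity stays inside the RVO, while the candidate that veers sideways falls outside it. A learned policy such as the one described in the abstract would consume RVO-derived features rather than run this test directly.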