Multiagent Path Finding Using Deep Reinforcement Learning Coupled With Hot Supervision Contrastive Loss
Lin Chen, Yaonan Wang, Yang Mo, Zhiqiang Miao, Hesheng Wang, Mingtao Feng, Sifei Wang
DOI: https://doi.org/10.1109/tie.2022.3206745
IF: 7.7
2023-02-22
IEEE Transactions on Industrial Electronics
Abstract: Multiagent path finding (MAPF) is employed to find collision-free paths that guide agents from their initial positions to their target positions. Advanced decentralized approaches use communication between agents to improve performance in environments with high-density obstacles. However, this reliance on communication dramatically reduces the robustness of multiagent systems. To overcome this difficulty, we propose a novel method for solving MAPF problems. In this method, expert data are transformed into supervised signals through a proposed hot supervised contrastive loss, which is combined with reinforcement learning to train fully decentralized policies. Agents reactively plan paths online in a partially observable world while exhibiting implicit coordination without communicating with one another. We introduce a self-attention mechanism in the policy network, which improves the network's ability to extract collaborative information between agents from the observation data. In simulation experiments, we demonstrate that the learned policy achieves good performance without communication between agents. Furthermore, real-world experiments demonstrate the effectiveness of our method in practical applications.
automation & control systems; engineering, electrical & electronic; instruments & instrumentation
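
The abstract mentions a self-attention mechanism in the policy network for extracting collaborative information between agents from observation data. The record does not describe the authors' architecture, so the following is only a minimal sketch of generic single-head scaled dot-product self-attention, not the paper's network. The function names, the per-agent feature vectors, and the identity projection matrices are all hypothetical and chosen for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def self_attention(features, w_q, w_k, w_v):
    """Generic single-head scaled dot-product self-attention.

    features: one feature vector per observed agent (hypothetical input).
    w_q, w_k, w_v: query, key, and value projection matrices.
    Returns (outputs, weights); weights[i][j] is how strongly
    agent i's output attends to agent j's features.
    """
    queries = [matvec(w_q, f) for f in features]
    keys = [matvec(w_k, f) for f in features]
    values = [matvec(w_v, f) for f in features]
    scale = math.sqrt(len(keys[0]))  # sqrt of the key dimension

    outputs, weights = [], []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        attn = softmax(scores)
        weights.append(attn)
        # Each output is the attention-weighted sum of the value vectors.
        outputs.append(
            [sum(a * v[j] for a, v in zip(attn, values))
             for j in range(len(values[0]))]
        )
    return outputs, weights


# Toy example: three agents with 2-D features and identity projections
# (assumed values, for illustration only).
identity = [[1.0, 0.0], [0.0, 1.0]]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(features, identity, identity, identity)
```

Each row of `weights` sums to 1, so every agent's output is a convex combination of the value vectors of all agents in view. In a learned policy the projection matrices would be trained parameters rather than fixed identities.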