Multi-USV Cooperative Chasing Strategy Based on Obstacles Assistance and Deep Reinforcement Learning
Wenhao Gan, Xiuqing Qu, Dalei Song, Peng Yao
DOI: https://doi.org/10.1109/tase.2023.3319510
2024-01-01
Abstract: How to use environmental information to improve the cooperative chasing efficiency of multiple unmanned surface vehicles (multi-USV) is a problem worthy of attention. This article considers an archipelago scenario and proposes an obstacle-assisted chasing framework based on reinforcement learning, in which a multi-USV system chases a smart evader through autonomous environmental awareness and target state prediction. When there are obstacles near the evader, the chasing group can use them to complete the chase quickly. Otherwise, within its range of perception, the group can either drive the evader closer to the obstacles or cooperate directly to catch it. The problem of group credit allocation is also considered, so that a relatively stable encircling structure is achieved with the fewest vehicles. Simulation experiments in a multi-obstacle environment demonstrate the flexibility of the proposed framework. Compared with traditional force-based and learning-based methods in untrained archipelago environments, the model trained by this framework is shown to have higher efficiency while ensuring real-time performance and generalization.
Note to Practitioners: The motivation of this paper is to provide guidance to a USV team in learning how to collaboratively chase and capture a target through multi-agent reinforcement learning. The method can also be applied to other vehicle chasing scenarios, such as ground robots. This paper proposes a new method that leverages the MA-POCA learning framework and the RSA mechanism to enhance cooperation and credit allocation among USVs. Additionally, a novel reward mechanism is designed to encourage USVs to fully utilize the advantages of the surrounding environment and accelerate cooperative chasing. With this method, multiple USVs can effectively pursue intelligent evaders through autonomous environmental perception and target state prediction. To validate the proposed method, we developed a high-fidelity simulation environment using Unity3D, incorporating USV dynamics, irregular obstacles, and sea surface disturbances. Furthermore, our method demonstrates low computational cost, making it suitable for practical USV navigation and control applications in the future.
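
The idea that obstacles can stand in for chasers when forming an encircling structure can be illustrated with a small geometric check. The sketch below is not the paper's capture criterion or reward: it assumes, purely for illustration, that obstacles are points, that only chasers and obstacles within a hypothetical `capture_radius` of the evader count as blockers, and that the evader is encircled when no angular gap between adjacent blockers exceeds a hypothetical `max_gap`. The function names and default values are likewise assumptions.

```python
import math

def max_angular_gap(evader, blockers):
    """Largest angular gap (radians) between adjacent blockers, seen from the evader."""
    if len(blockers) < 2:
        return 2 * math.pi  # zero or one blocker leaves the full circle open
    angles = sorted(math.atan2(by - evader[1], bx - evader[0]) for bx, by in blockers)
    gaps = [b - a for a, b in zip(angles, angles[1:])]
    gaps.append(angles[0] + 2 * math.pi - angles[-1])  # wrap-around gap
    return max(gaps)

def is_encircled(evader, chasers, obstacles,
                 capture_radius=5.0, max_gap=math.radians(150)):
    """Illustrative check only: chasers and nearby obstacles both act as blockers."""
    near = [p for p in chasers + obstacles if math.dist(p, evader) <= capture_radius]
    return max_angular_gap(evader, near) <= max_gap

evader = (0.0, 0.0)
chasers = [(3.0, 0.0), (-1.5, 2.6)]   # two USVs, roughly 120 degrees apart
obstacles = [(-1.5, -2.6)]            # an obstacle closing the remaining gap

with_obstacle = is_encircled(evader, chasers, obstacles)
without_obstacle = is_encircled(evader, chasers, [])
```

Under these assumptions, two chasers plus one well-placed obstacle close every gap, while the same two chasers alone leave an escape sector of about 240 degrees, which is the intuition behind encircling with fewer vehicles.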