Cooperative Defense of Autonomous Surface Vessels with Quantity Disadvantage Using Behavior Cloning and Deep Reinforcement Learning

Siqing Sun,Tianbo Li,Xiao Chen,Huachao Dong,Xinjing Wang
DOI: https://doi.org/10.1016/j.asoc.2024.111968
IF: 8.7
2024-01-01
Applied Soft Computing
Abstract:Autonomous Surface Vessels (ASVs) excel at undertaking hazardous tasks, garnering significant attention recently. Particularly, ASV cooperative defense is a crucial application for protecting harbors and combating smugglers. Here, ASVs intercept intruders from reaching a protected region. Unlike most research, which assumes defenders with numerical advantages, this work considers a more practical defense mission with fewer defenders, defender damages, and intruders employing evasion strategies. However, interception challenges are also introduced, including ASV underactuated dynamics, a limited interception time window, and environmental nonstationarity. Directly applying existing defense methods to such missions may not achieve success. To surmount the challenges, we propose an ASV decision-making framework by integrating supervised learning and deep reinforcement learning. Initially, supervised learning uses actions from a bi-level controller to train ASVs, addressing underactuated dynamics and aiding policy convergence. Subsequently, deep reinforcement learning explores more effective policies to enhance interception rates. Furthermore, hybrid rewards are meticulously designed to drive policy optimizations while mitigating environmental nonstationarity. Finally, numerical simulations are carried out to verify the effectiveness of our approach.
What problem does this paper attempt to address?