Underwater Target Tracking Based on Hierarchical Software-Defined Multi-AUV Reinforcement Learning: A Multi-AUV Advantage-Attention Actor-Critic Approach
Shengchao Zhu, Guangjie Han, Chuan Lin, Qiuzi Tao
DOI: https://doi.org/10.1109/TMC.2024.3437376
IF: 6.075
2024-12-01
IEEE Transactions on Mobile Computing
Abstract: With the rapid development of underwater robots and underwater communication techniques, the Autonomous Underwater Vehicle (AUV) cluster network has emerged as a candidate paradigm for underwater civil and military applications, e.g., underwater target tracking. In this paper, we focus on how to use networking and multi-agent artificial intelligence techniques to improve underwater target tracking. In particular, to improve the flexibility and scalability of the AUV cluster network, we combine Software-Defined Networking (SDN) with Centralized Training with Decentralized Execution (CTDE)-based Multi-Agent Reinforcement Learning (MARL) to propose a Hierarchical Software-Defined Multiple AUVs Reinforcement Learning (HSD-MARL) framework. For the MARL mechanism in HSD-MARL, we propose an advantage-attention mechanism and present the architecture of Multi-AUV Advantage-Attention Actor-Critic (MA-A3C) to address the slow convergence and poor scalability that arise in large-scale AUV cluster networks. Further, to improve the utilization rate of advantage samples, especially when MA-A3C is used for AUV cluster network-based underwater tracking, we propose an ‘advantage resampling’ method based on the experience replay buffer. Evaluation results show that our proposed approaches can perform accurate underwater target tracking on AUV cluster network systems and outperform several recent approaches in terms of convergence speed, tracking accuracy, etc.
Engineering, Environmental Science, Computer Science