Safe Multi-Agent Learning Control for Unmanned Surface Vessels Cooperative Interception Mission

Bin Du, Gang Liu, Wei Xie, Weidong Zhang
DOI: https://doi.org/10.1109/icarm54641.2022.9959180
2022-01-01
Abstract: This paper develops a multi-agent reinforcement learning control method for the unmanned surface vessel (USV) interception mission. A safe proximal policy optimization (SPPO) method is developed to control defender vessels that block an invading ship. SPPO employs a joint state-value function to strengthen cooperation between the defender vessels. A safety constraint is also introduced into action and state value estimation to reduce irresponsible actions: we formulate a safety constraint on motion boundaries and convert the safety problem into a constraint within the learning-based control method. Simulation results on cooperative control of the defender USVs illustrate the effectiveness of SPPO. While achieving high reward, SPPO can control the USVs to accomplish cooperative interception missions.