COLREGs-constrained Safe Reinforcement Learning for Realising MASS's Risk-Informed Collision Avoidance Decision Making

Chengbo Wang, Xinyu Zhang, Hongbo Gao, Musa Bashir, Huanhuan Li, Zaili Yang
DOI: https://doi.org/10.1016/j.knosys.2024.112205
IF: 8.139
2024-01-01
Knowledge-Based Systems
Abstract: Maritime autonomous surface ships (MASS) represent a significant advancement in maritime technology, offering the potential for increased efficiency, reduced operational costs, and enhanced maritime traffic safety. However, MASS navigation in complex maritime traffic and congested water areas presents challenges, especially in Collision Avoidance Decision Making (CADM) during multi-ship encounter scenarios. Through a robust risk assessment design for time-sequential and joint-target ships (TSs) encounter scenarios, a novel risk and reliability critic-enhanced safe hierarchical reinforcement learning (RA-SHRL) method, constrained by the International Regulations for Preventing Collisions at Sea (COLREGs), is proposed to realize the autonomous navigation and CADM of MASS. Experimental simulations are conducted against a time-sequenced obstacle avoidance scenario and a swarm obstacle avoidance scenario. The experimental results demonstrate that RA-SHRL generates safe, efficient, and reliable collision avoidance strategies in both time-sequential dynamic obstacle and mixed joint-TSs environments. Additionally, RA-SHRL is capable of assessing risk and avoiding multiple joint-TSs. Compared with Deep Q-network (DQN) and Constrained Policy Optimization (CPO), the search efficiency of the proposed algorithm is improved by 40% and 12%, respectively. Moreover, it achieved a 91.3% success rate of collision avoidance during training. The methodology could also benefit other autonomous systems in dynamic environments.