AI on the Water: Applying DRL to Autonomous Vessel Navigation

Md Shadab Alam, Sanjeev Kumar Ramkumar Sudha, Abhilash Somayajula
2023-10-23
Abstract: Human decision-making errors cause a majority of globally reported marine accidents. As a result, automation in the marine industry has been gaining more attention in recent years. Obstacle avoidance becomes very challenging for an autonomous surface vehicle in an unknown environment. We explore the feasibility of using Deep Q-Learning (DQN), a deep reinforcement learning approach, for controlling an underactuated autonomous surface vehicle to follow a known path while avoiding collisions with static and dynamic obstacles. The ship's motion is described using a three-degree-of-freedom (3-DOF) dynamic model. The KRISO container ship (KCS) is chosen for this study because it is a benchmark hull used in several studies, and its hydrodynamic coefficients are readily available for numerical modelling. This study shows that Deep Reinforcement Learning (DRL) can achieve path following and collision avoidance successfully and can be a potential candidate that may be investigated further to achieve human-level or even better decision-making for autonomous marine vehicles.
Systems and Control
What problem does this paper attempt to address?
The paper asks how deep reinforcement learning (DRL), specifically Deep Q-Learning (DQN), can control an underactuated autonomous surface vehicle so that it follows a predetermined path in an unknown environment while avoiding collisions with static and dynamic obstacles.

### Specific problem description:

1. **Maritime accidents caused by human decision-making errors**: Most globally reported maritime accidents are caused by human decision-making errors, so automation in the marine industry is receiving increasing attention.
2. **Challenges of autonomous surface vehicles**: In an unknown environment, obstacle avoidance for autonomous surface vehicles (ASVs) becomes very challenging. Traditional autopilot systems (such as those based on the line-of-sight method and PID controllers) are effective but may perform poorly in complex environments.
3. **Combination of path tracking and obstacle avoidance**: A single method must achieve path tracking and obstacle avoidance at the same time so that the vehicle can complete its task safely and efficiently in a complex environment.

### Research objectives:

- Explore the feasibility of DRL for the navigation of autonomous surface vehicles.
- Use the DQN algorithm to make the vehicle follow a known path while avoiding static and dynamic obstacles.
- Assess whether DRL is a viable candidate for reaching human-level or better decision-making, and thereby for improving the safety and efficiency of autonomous marine vehicles.

### Methods:

- Use a three-degree-of-freedom (3-DOF) dynamic model to describe the ship's motion.
- Select the KRISO container ship (KCS) as the study vessel because it is a widely used benchmark hull and its hydrodynamic coefficients are readily available.
- Design the observation state space, action space, and reward structure to suit environments with static and dynamic obstacles.
- Implement the DQN algorithm in TensorFlow and carry out extensive training and testing.

### Results:

The study shows that DRL, through the DQN algorithm, can achieve path tracking and obstacle avoidance, which demonstrates its potential for the navigation of autonomous marine vehicles. Future work will consider environmental factors (such as ocean currents and waves) and test the approach on an actual autonomous surface vehicle.
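
The observation, action, and reward design listed under Methods is not detailed in this summary. As a rough illustration only, the sketch below shows the three generic pieces a DQN agent of this kind needs: a discrete action set, epsilon-greedy action selection, and the standard one-step Q-learning target. The rudder angles, the reward terms and their weights, and every function name here are hypothetical placeholders chosen for the example; they are not the paper's values, and the sketch uses plain Python rather than the TensorFlow implementation the paper reports.

```python
import math
import random

# Hypothetical discrete rudder commands in degrees (assumption, not from the paper).
RUDDER_ACTIONS_DEG = [-35.0, -20.0, 0.0, 20.0, 35.0]


def reward(cross_track_error, heading_error, obstacle_distance, collided):
    """Illustrative reward; all terms and constants are assumptions.

    cross_track_error: distance from the desired path (e.g. in ship lengths)
    heading_error:     difference from the desired heading, in radians
    obstacle_distance: distance to the nearest obstacle (same unit as above)
    collided:          True if the vessel has hit an obstacle
    """
    if collided:
        return -100.0  # large terminal penalty for a collision
    path_term = math.exp(-abs(cross_track_error))  # 1.0 on the path, decays off it
    heading_term = math.cos(heading_error)          # 1.0 when aligned with the path
    # Penalise proximity only inside a hypothetical 5-unit safety radius.
    if obstacle_distance < 5.0:
        obstacle_term = -1.0 / max(obstacle_distance, 1e-3)
    else:
        obstacle_term = 0.0
    return path_term + heading_term + obstacle_term


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action index with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda i: q_values[i])


def dqn_target(r, next_q_values, done, gamma=0.99):
    """One-step Q-learning target: r if the episode ended, else r + gamma * max Q(s', a')."""
    if done:
        return r
    return r + gamma * max(next_q_values)
```

In a full agent, `next_q_values` would come from a separate target network evaluated on the next observation, and the online network would be trained to regress its Q-value for the chosen action toward `dqn_target(...)`. The index returned by `epsilon_greedy` would select an entry of `RUDDER_ACTIONS_DEG` to apply to the ship model.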