A knowledge-free path planning approach for smart ships based on reinforcement learning

Chen Chen, Xian-Qiao Chen, Feng Ma, Xiao-Jun Zeng, Jin Wang
DOI: https://doi.org/10.1016/j.oceaneng.2019.106299
IF: 5
2019-10-01
Ocean Engineering
Abstract: The autonomous navigation of smart ships must account for their huge inertia and obey existing complex rules. A smart ship has to realise autonomous driving instead of manual operation, which consists of path planning and controlling. Toward this goal, this research proposes a path planning and manipulating approach based on Q-learning, which can drive a cargo ship by itself without requiring any input from human experience. At the very beginning, a ship is modelled with the Nomoto model in a simulated waterway. Then, distances, obstacles and prohibited areas are regularized as rewards or punishments, which are used to judge the performance or manipulation decisions of the ship. Subsequently, Q-learning is introduced to learn the action–reward model, and the learning outcome is used to manipulate the ship's movement. By chasing higher reward values, the ship can find an appropriate path or navigation strategy by itself. After a sufficient number of rounds of training, a convincing path and manipulating strategy will likely be produced. A comparison of the proposed approach with existing methods shows that it is more effective in self-learning and continuous optimisation, and therefore closer to human manoeuvring.
engineering, civil, ocean, marine, oceanography
What problem does this paper attempt to address?
This paper addresses path planning for the autonomous navigation of smart ships. Specifically, it studies how reinforcement learning, in particular Q-learning, can drive a cargo ship autonomously without any input from human experience. The task has two parts, path planning and control, and must respect the huge inertia of cargo ships and complex navigation rules while coping with dynamic environments, insufficient power and perceptual uncertainty. Traditional path planning methods such as the A* algorithm, the artificial potential field (APF) method and rapidly-exploring random trees (RRT) perform well for land robots, but are often unsuitable for navigation that must account for the dynamic characteristics of cargo ships. The paper therefore proposes a path planning method based on Q-learning. After many rounds of training in a simulated environment, the smart ship can find a suitable path or navigation strategy by itself, which brings its behaviour closer to human manoeuvring.
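
The pipeline described above (a Nomoto ship model, rewards built from distances and obstacles, and a Q-learning update) can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the first-order Nomoto form T·r' + r = K·δ, a tabular Q-function, and a hypothetical reward shape, and every constant and function name (`nomoto_step`, `reward`, `q_update`, `K`, `T`, `SPEED`, `DT`, `RUDDER_ACTIONS`) is chosen purely for illustration.

```python
import math
from collections import defaultdict

# All constants are illustrative assumptions, not values from the paper.
K, T = 0.2, 10.0                      # Nomoto gain (1/s) and time constant (s)
SPEED, DT = 5.0, 1.0                  # constant surge speed (m/s), time step (s)
RUDDER_ACTIONS = (-0.35, 0.0, 0.35)   # candidate rudder angles (rad)


def nomoto_step(x, y, psi, r, delta):
    """One Euler step of the first-order Nomoto model T*r' + r = K*delta."""
    r_next = r + DT * (K * delta - r) / T          # yaw rate
    psi_next = psi + DT * r_next                   # heading
    x_next = x + DT * SPEED * math.cos(psi_next)   # position
    y_next = y + DT * SPEED * math.sin(psi_next)
    return x_next, y_next, psi_next, r_next


def reward(x, y, goal, obstacles, safe_radius=20.0):
    """Hypothetical reward: punish obstacle proximity, else negative goal distance."""
    if any(math.hypot(x - ox, y - oy) < safe_radius for ox, oy in obstacles):
        return -100.0
    return -math.hypot(goal[0] - x, goal[1] - y) / 100.0


def q_update(q, state, action, rew, next_state, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update; returns the new Q(state, action)."""
    best_next = max(q[(next_state, a)] for a in range(len(RUDDER_ACTIONS)))
    q[(state, action)] += alpha * (rew + gamma * best_next - q[(state, action)])
    return q[(state, action)]


# Q-table keyed by (discretised state, action index); unseen entries start at 0.
q_table = defaultdict(float)
```

A training loop would discretise the ship state (for example, position cell and heading bin) into a hashable `state`, pick an index into `RUDDER_ACTIONS` with an exploration rule such as epsilon-greedy, advance the ship with `nomoto_step`, and call `q_update` with the resulting reward. The state encoding and exploration schedule are left open here because the abstract does not specify them.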