An Algorithm of Complete Coverage Path Planning for Deep‐Sea Mining Vehicle Clusters Based on Reinforcement Learning

Bowen Xing, Xiao Wang, Zhenchong Liu
DOI: https://doi.org/10.1002/adts.202300970
2024-01-16
Advanced Theory and Simulations
Abstract: This paper combines a deep Q network, sample priority, long short‐term memory, and pre‐processing mechanisms to design a complete coverage path planning algorithm for deep‐sea mining vehicle clusters. A map‐fusion mechanism and a state matrix preprocessing method are proposed. Constraints based on distance variance are also designed to avoid hose entanglement. This paper proposes a deep reinforcement learning algorithm to achieve complete coverage path planning for deep‐sea mining vehicle clusters. First, the mining vehicles and the deep‐sea mining environment are modeled. Then, a series of algorithm designs and optimizations is implemented based on Deep Q Networks (DQN). The map‐fusion mechanism integrates the grid matrix data from multiple mining vehicles to obtain the state matrix of the complete environment. A preprocessing method for the state matrix is also designed to provide suitable training data for the neural network. The reward function and action selection mechanism are optimized according to the requirements of cluster cooperative operation. Furthermore, the algorithm uses distance constraints to prevent the entanglement of underwater hoses. To improve the training efficiency of the neural network, the algorithm filters and selects training samples according to a sample quality score. To meet the requirements of the cluster complete coverage mission, Long Short‐Term Memory (LSTM) is introduced into the neural network to achieve a better training effect. Finally, the proposed algorithm is verified through simulation experiments.
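The abstract does not give the details of the map‐fusion mechanism or the distance‐variance constraint, so the following is only a minimal sketch under stated assumptions. It assumes each vehicle holds a grid matrix of the same shape with hypothetical integer cell codes ordered so that a larger code is more informative, and it assumes the distance‐variance term is the variance of pairwise Euclidean distances between vehicles. The names `fuse_maps`, `distance_variance`, and the cell codes are illustrative and do not come from the paper.

```python
import numpy as np

# Hypothetical cell codes (an assumption, not taken from the paper).
# They are ordered so that a larger value carries more information.
UNKNOWN, FREE, COVERED, OBSTACLE = 0, 1, 2, 3


def fuse_maps(local_maps):
    """Fuse per-vehicle grid matrices into one state matrix.

    Assumes all grids share the same shape; each fused cell takes the
    most informative (largest) code reported by any vehicle.
    """
    stacked = np.stack(local_maps)  # shape: (n_vehicles, rows, cols)
    return stacked.max(axis=0)


def distance_variance(positions):
    """Variance of pairwise Euclidean distances between vehicles.

    One possible reading of a "distance variance" term: a large value
    means the cluster spacing is uneven. Returns 0.0 for two vehicles.
    """
    pts = np.asarray(positions, dtype=float)
    i, j = np.triu_indices(len(pts), k=1)  # each unordered pair once
    dists = np.linalg.norm(pts[i] - pts[j], axis=1)
    return float(dists.var())


# Two vehicles that have each observed a different part of a 2x2 area.
map_a = np.array([[UNKNOWN, FREE], [COVERED, UNKNOWN]])
map_b = np.array([[OBSTACLE, UNKNOWN], [FREE, UNKNOWN]])
fused = fuse_maps([map_a, map_b])
```

In a sketch like this, the fused matrix would be the input to the state preprocessing step, and the variance value could be thresholded or added as a penalty in the reward; how the paper actually uses either quantity is not stated in the abstract.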
multidisciplinary sciences