A Novel Approach in Maritime Data Completion Using Deep Learning and NLP Techniques

Yong Li, Zhishan Wang
DOI: https://doi.org/10.3390/jmse12060868
IF: 2.744
2024-05-24
Journal of Marine Science and Engineering
Abstract: In the extensive monitoring of maritime traffic, maritime management frequently encounters incomplete automatic identification system (AIS) data. This deficiency poses significant challenges to safety management, requiring effective methods to infer corresponding ship information. We tackle this issue using a classification approach. Due to the absence of a fixed road network at sea unlike on land, raw trajectories are difficult to convert and cannot be directly fed into neural networks. We devised a latitude–longitude gridding encoding strategy capable of transforming continuous latitude–longitude data into discrete grid points. Simultaneously, we employed a compression algorithm to further extract significant grid points, thereby shortening the encoding sequence. Utilizing natural language processing techniques, we integrate the Word2vec word embedding approach with our novel biLSTM self-attention chunk-max pooling net (biSAMNet) model, enhancing the classification of vessel trajectories. This method classifies targets into ship types and ship lengths within static information. Employing the Taiwan Strait as a case study and benchmarking against CNN, RNN, and methods based on the attention mechanism, our findings underscore our model's superiority. The biSAMNet achieves an impressive trajectory classification F1 score of 0.94 in the ship category dataset using only five-dimensional word embeddings. Additionally, through ablation experiments, the effectiveness of the Word2vec pre-trained embedding layer is highlighted. This study introduces a novel method for handling ship trajectory data, addressing the challenge of obtaining ship static information when AIS data are unreliable.
engineering, ocean, oceanography, marine
What problem does this paper attempt to address?
### Problems Addressed by the Paper

The paper addresses the problem of incomplete Automatic Identification System (AIS) data in maritime traffic monitoring. Specifically:

1. **Background and Challenges**:
   - Large-scale maritime traffic monitoring frequently encounters incomplete AIS data.
   - These gaps pose significant challenges to safety management and call for effective methods to infer the missing vessel information.
2. **Methods and Innovations**:
   - The problem is treated as a classification task.
   - Unlike land, the sea has no fixed road network, so raw trajectories are difficult to convert and cannot be fed directly into neural networks.
   - A latitude–longitude grid encoding strategy converts continuous latitude–longitude data into discrete grid points.
   - A compression algorithm then extracts the significant grid points, shortening the encoded sequence.
   - Drawing on natural language processing techniques, the Word2vec word embedding method is combined with a new biLSTM self-attention chunk-max pooling net (biSAMNet) model to improve the classification of vessel trajectories.
   - The method classifies targets by ship type and ship length, both of which belong to the AIS static information.
3. **Experimental Validation**:
   - The Taiwan Strait serves as the case study, and the model is benchmarked against CNN, RNN, and attention-based methods.
   - The biSAMNet model achieves an F1 score of 0.94 on the ship category dataset using only five-dimensional word embeddings.
   - Ablation experiments confirm the effectiveness of the Word2vec pre-trained embedding layer.

In summary, the paper proposes a novel method for handling vessel trajectory data, addressing the challenge of obtaining static vessel information when AIS data are unreliable.
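
The grid encoding and sequence shortening described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the grid origin (`lat0`, `lon0`), the cell size `cell_deg`, the `row_col` token format, and all function names are assumptions made for illustration. The abstract does not specify the compression algorithm, so collapsing consecutive repeated cells is used here as a simple stand-in.

```python
import math


def grid_token(lat, lon, lat0, lon0, cell_deg):
    """Map a position to a discrete grid-cell token of the form 'row_col'.

    lat0/lon0 (south-west corner of the grid) and cell_deg (cell size in
    degrees) are hypothetical parameters, not values from the paper.
    """
    row = math.floor((lat - lat0) / cell_deg)
    col = math.floor((lon - lon0) / cell_deg)
    return f"{row}_{col}"


def encode_trajectory(points, lat0, lon0, cell_deg):
    """Turn a list of (lat, lon) pairs into a sequence of grid tokens."""
    return [grid_token(lat, lon, lat0, lon0, cell_deg) for lat, lon in points]


def collapse_repeats(tokens):
    """Drop consecutive duplicate tokens.

    Stand-in for the paper's compression step, which is not described
    in the abstract.
    """
    shortened = []
    for token in tokens:
        if not shortened or shortened[-1] != token:
            shortened.append(token)
    return shortened


# Made-up positions, chosen only to exercise the functions.
track = [(24.26, 118.94), (24.27, 118.96), (24.31, 119.03), (24.38, 119.12)]
encoded = encode_trajectory(track, lat0=22.0, lon0=117.0, cell_deg=0.1)
shortened = collapse_repeats(encoded)
print(encoded)
print(shortened)
```

Each token then plays the role of a "word" and each trajectory the role of a "sentence", which is what allows a word embedding method such as Word2vec to be trained on the encoded sequences.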