Real-time Ship Recognition and Georeferencing for the Improvement of Maritime Situational Awareness

Borja Carrillo Perez
DOI: https://doi.org/10.26092/elib/3265
2024-10-07
Abstract: In an era where maritime infrastructures are crucial, advanced situational awareness solutions are increasingly important. The use of optical camera systems can allow real-time usage of maritime footage. This thesis presents an investigation into leveraging deep learning and computer vision to advance real-time ship recognition and georeferencing for the improvement of maritime situational awareness. A novel dataset, ShipSG, is introduced, containing 3,505 images and 11,625 ship masks with corresponding class and geographic position. After an exploration of the state of the art, a custom real-time segmentation architecture, ScatYOLOv8+CBAM, is designed for the NVIDIA Jetson AGX Xavier embedded system. This architecture adds the 2D scattering transform and attention mechanisms to YOLOv8, achieving an mAP of 75.46% and an inference time of 25.3 ms per frame, outperforming state-of-the-art methods by over 5%. To improve small and distant ship recognition in high-resolution images on embedded systems, an enhanced slicing mechanism is introduced, improving mAP by 8% to 11%. Additionally, a georeferencing method is proposed, achieving positioning errors of 18 m for ships up to 400 m away and 44 m for ships between 400 m and 1200 m. The findings are also applied in real-world scenarios, such as the detection of abnormal ship behaviour, camera integrity assessment and 3D reconstruction. The approach of this thesis outperforms existing methods and provides a framework for integrating recognized and georeferenced ships into real-time systems, enhancing operational effectiveness and decision-making for maritime stakeholders. This thesis contributes to the maritime computer vision field by establishing a benchmark for ship segmentation and georeferencing research, demonstrating the viability of deep-learning-based recognition and georeferencing methods for real-time maritime monitoring.
Computer Vision and Pattern Recognition, Artificial Intelligence
What problem does this paper attempt to address?
The paper aims to improve maritime situational awareness through real-time ship identification and geolocation. It focuses on the following issues:

1. **Limitations of Existing Ship Monitoring Systems**:
   - The Automatic Identification System (AIS) suffers from update delays and is vulnerable to cyber-attacks.
   - Satellite imagery and radar systems face data acquisition and processing delays in real-time applications.
2. **Advantages and Challenges of Optical Camera Systems**:
   - Optical camera systems can provide real-time ship monitoring, but the large number of video streams is difficult for operators to handle.
   - To improve operational efficiency, ships need to be identified and geolocated automatically using computer vision and deep learning.
3. **Identification Accuracy of Small and Long-Distance Ships**:
   - A key challenge in maritime monitoring is identifying small and distant ships, which is crucial for safety and security and helps with early threat detection and accident prevention.
4. **Application of Embedded Devices**:
   - Embedded devices equipped with GPUs can process data locally, enabling real-time ship identification and geolocation with low latency, low energy consumption, and high security in an edge-computing environment.

### Main Contributions of the Paper

To address these problems, the paper proposes and validates the following methods and techniques:

- **ShipSG Dataset**: A new dataset for ship identification and geolocation, containing 3,505 images and 11,625 ship masks, used for developing and validating the identification and geolocation methods.
- **ScatYOLOv8+CBAM Architecture**: A new real-time segmentation architecture that adds the 2D scattering transform and attention mechanisms to YOLOv8 and is designed for the NVIDIA Jetson AGX Xavier embedded system. It achieves 75.46% mAP with an inference time of 25.3 milliseconds per frame.
- **Improved Small-Ship Segmentation Mechanism**: An enhanced slicing mechanism that uses batch inference and prediction merging, improving the mAP of small-ship segmentation by 8% to 11%.
- **Geolocation Method**: A single-image geolocation method that uses a homography transformation to automatically compute the geographic position of ships from their pixel coordinates, without prior knowledge of the camera parameters. It achieves positioning errors of 18 ± 10 meters and 44 ± 27 meters inside and outside the port, respectively.

With these methods, the paper shows how deep learning and computer vision can significantly enhance maritime situational awareness and provide effective solutions for practical application scenarios.
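
The single-image geolocation idea can be illustrated with a planar homography, assuming that is the transformation the summary refers to and that ships lie on a flat water plane. The sketch below is not the thesis's implementation: the function names (`fit_homography`, `pixel_to_ground`), the control points, and the use of local east/north metres relative to a hypothetical camera position are all assumptions made for illustration. It estimates the 3x3 matrix from pixel-to-ground correspondences with a normalized direct linear transform, so no camera intrinsics or pose are needed.

```python
import numpy as np

def _normalizer(pts):
    """Similarity transform: centre the points, scale mean distance to sqrt(2)."""
    centroid = pts.mean(axis=0)
    scale = np.sqrt(2) / np.mean(np.linalg.norm(pts - centroid, axis=1))
    return np.array([[scale, 0.0, -scale * centroid[0]],
                     [0.0, scale, -scale * centroid[1]],
                     [0.0, 0.0, 1.0]])

def _apply_affine(T, pts):
    """Apply a 3x3 affine transform to an (N, 2) array of points."""
    homog = np.column_stack([pts, np.ones(len(pts))])
    return (homog @ T.T)[:, :2]

def fit_homography(pixels, ground):
    """Estimate H such that ground ~ H @ pixel, from at least 4 correspondences."""
    pixels = np.asarray(pixels, dtype=float)
    ground = np.asarray(ground, dtype=float)
    T_p, T_g = _normalizer(pixels), _normalizer(ground)
    rows = []
    for (u, v), (x, y) in zip(_apply_affine(T_p, pixels),
                              _apply_affine(T_g, ground)):
        # Two linear constraints per correspondence on the 9 entries of H.
        rows.append([-u, -v, -1, 0, 0, 0, x * u, x * v, x])
        rows.append([0, 0, 0, -u, -v, -1, y * u, y * v, y])
    # The solution is the right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(np.array(rows))
    H_norm = vt[-1].reshape(3, 3)
    # Undo the normalization so H maps raw pixels to raw ground coordinates.
    H = np.linalg.inv(T_g) @ H_norm @ T_p
    return H / np.linalg.norm(H)

def pixel_to_ground(H, u, v):
    """Map one pixel to ground-plane coordinates via the homography."""
    x, y, w = H @ np.array([u, v, 1.0])
    return x / w, y / w

# Hypothetical control points: image pixels and their known positions in
# metres east/north of the camera (made-up values, not from ShipSG).
control_pixels = [(100, 900), (1800, 950), (1500, 400), (300, 380)]
control_metres = [(-60, 50), (80, 55), (300, 600), (-250, 650)]

H = fit_homography(control_pixels, control_metres)

# Hypothetical pixel taken from a predicted ship mask, e.g. a point on its waterline.
east, north = pixel_to_ground(H, 960, 600)
print(f"ship at {east:.1f} m east, {north:.1f} m north of the camera")
```

Local metric coordinates keep the example simple; mapping to latitude and longitude would need either control points expressed in geographic coordinates or an additional projection step, neither of which is specified here.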