Near-infrared maritime target detection based on Swin-Transformer model.

Liang Sui, Wenli Sun, Xu Gao
DOI: https://doi.org/10.1145/3556384.3556417
2022-01-01
Abstract: Detecting targets such as ships and buoys in marine scenes is necessary. However, daylight image-based detection suffers from lighting changes that reduce the recognition rate, and it does not work at night, which severely limits it in practical applications. In this paper, Swin-Transformer, a deep learning model based on the attention mechanism, is applied to target detection in maritime near-infrared (NIR) images. Unlike convolutional neural networks, which extract features such as color, texture, and geometry, Swin-Transformer quickly scans the global image to find target regions to focus on and then assigns them greater attention weights. This approach is useful for target detection in maritime NIR images with low resolution and poor contrast. We use a well-known marine dataset, the Singapore Maritime Dataset, for training and evaluation. We find that even models trained very well on the visible-light dataset detect poorly on NIR images. This makes it worthwhile to train the Swin-Transformer model on NIR images as a complement to the regular model trained on daylight images. Experiments show that the Swin-Transformer model serving as the Faster R-CNN backbone for maritime target detection achieves an F1-score of 0.867, a 5.6% improvement over the ResNet model previously used as a benchmark. Moreover, the Swin-Transformer model trained on hybrid daylight/NIR images performs better still on the NIR test set, outperforming the same model trained on the daylight or NIR dataset alone, with F1-scores 9.5% and 4.0% higher, respectively.
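The F1-score reported above is the harmonic mean of precision and recall. The following is a minimal sketch of that standard definition, not the authors' evaluation code. The function name `f1_score` and the detection counts are hypothetical and chosen only for illustration; how detections are matched to ground truth (for example, by an IoU threshold) is not stated in the abstract and is not modeled here.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return the F1-score from true-positive, false-positive and false-negative counts."""
    if tp == 0:
        # No correct detections: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)  # fraction of detections that are correct
    recall = tp / (tp + fn)     # fraction of ground-truth targets that were found
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, not taken from the paper.
print(round(f1_score(tp=80, fp=10, fn=20), 3))  # 0.842
```

The same value follows from the equivalent form 2·TP / (2·TP + FP + FN), which here gives 160 / 190.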