Unmanned Surface Vessel Visual Object Detection under All-Weather Conditions with Optimized Feature Fusion Network in YOLOv4

Sun Xiaoqiang, Liu Tao, Yu Xiuping, Pang Bo
DOI: https://doi.org/10.1007/s10846-021-01499-8
2021-01-01
Journal of Intelligent & Robotic Systems
Abstract: Object detection based on visual images provides significant technical support for autonomous environment perception by unmanned surface vehicles (USVs), yet object detection on the sea surface remains difficult because of constantly changing weather conditions, large variations in object scale, and severe object shaking. In this paper, a USV object detection algorithm is proposed based on an optimized feature fusion network. YOLOv4, currently one of the object detection models with the best trade-off between speed and accuracy, was selected as the baseline. In the proposed weighted Cross Stage Partial Path Aggregation Network (wCSPPAN) structure, the concept of the Cross Stage Partial Network (CSPNet) was applied to the feature fusion network to significantly reduce the amount of computation. In addition, learnable weights were introduced so that input features of different resolutions are combined in a more reasonable way. Finally, to obtain cross-space and cross-scale feature interactions, we attempted to integrate the Feature Pyramid Transformer (FPT), a universal visual component based on the transformer mechanism, into the YOLOv4 model. The FPT transforms the feature pyramid into another feature pyramid of the same size, with richer contextual information at each level of the feature map. The performance is compared with that of other advanced object detection algorithms. Experiments on a sea surface buoy dataset, collected and produced independently for the present study, and on the open-source Singapore Maritime Dataset verify that the proposed method achieves good results in detecting sea objects of different scales under various weather conditions. In particular, the detection of long-distance small objects under low-visibility foggy conditions and other extreme conditions was improved, and real-time sea surface object detection was achieved with an inference speed of up to 36 FPS on a single RTX 2080Ti. Further, the detection results on the COCO and KITTI datasets verify the excellent generalization ability of the proposed object detection model.
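The learnable-weight idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state how the weights are normalized, so the version below assumes a ReLU followed by division by the sum of the weights (a scheme commonly used for weighted feature fusion); the function names `weighted_fusion` and `upsample_nearest`, the `eps` value, and the example weights are all hypothetical and are not taken from the paper.

```python
import numpy as np

def upsample_nearest(x, factor=2):
    """Nearest-neighbour upsampling of a (C, H, W) feature map (illustrative helper)."""
    return x.repeat(factor, axis=-2).repeat(factor, axis=-1)

def weighted_fusion(features, weights, eps=1e-4):
    """Fuse same-shape feature maps using non-negative, normalized weights.

    Assumption: weights are clipped at zero (ReLU) and divided by their sum;
    the paper may use a different normalization.
    """
    w = np.maximum(np.asarray(weights, dtype=np.float64), 0.0)  # keep weights non-negative
    w = w / (w.sum() + eps)                                     # normalize so they sum to ~1
    stacked = np.stack(features, axis=0)                        # shape (N, C, H, W)
    return np.tensordot(w, stacked, axes=1)                     # weighted sum over N

# Two pyramid levels: a fine 4x4 map and a coarse 2x2 map brought to the same size.
fine = np.zeros((1, 4, 4))
coarse = np.ones((1, 2, 2))
fused = weighted_fusion([fine, upsample_nearest(coarse)], weights=[1.0, 3.0])
```

In a trained network the entries of `weights` would be parameters updated by backpropagation, which lets the model decide how much each resolution contributes to the fused map instead of summing the inputs with equal importance.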