SEB-YOLO: An Improved YOLOv5 Model for Remote Sensing Small Target Detection

Yan Hui, Shijie You, Xiuhua Hu, Panpan Yang, Jing Zhao
DOI: https://doi.org/10.3390/s24072193
IF: 3.9
2024-03-30
Sensors
Abstract: Target detection in remote sensing scenarios is challenging because little semantic information can be extracted from small objects and similar targets are hard to distinguish, which results in poor detection performance. This paper proposes an improved YOLOv5 algorithm for remote sensing image target detection, SEB-YOLO (SPD-Conv + ECSPP + Bi-FPN + YOLOv5). First, a module consisting of a space-to-depth (SPD) layer followed by a non-strided convolution (Conv) layer (SPD-Conv) is used to reconstruct the backbone network, which retains global features and reduces feature loss. Meanwhile, a pooling module with an attention mechanism is designed for the final layer of the backbone network to help the network better identify and locate targets. Furthermore, a bidirectional feature pyramid network (Bi-FPN) with bilinear interpolation upsampling is added to improve bidirectional cross-scale connection and weighted feature fusion. Finally, a decoupled head is introduced to improve model convergence and resolve the conflict between the classification task and the regression task. Experimental results on the NWPU VHR-10 and RSOD datasets show that the mAP of the proposed algorithm reaches 93.5% and 93.9%, respectively, which is 4.0% and 5.3% higher than that of the original YOLOv5l algorithm. The proposed algorithm achieves better detection results for complex remote sensing images.
Engineering, Electrical & Electronic; Chemistry, Analytical; Instruments & Instrumentation
What problem does this paper attempt to address?
The paper addresses the challenge of small object detection in remote sensing images. Small objects yield limited semantic information and similar objects are hard to tell apart, so the original YOLOv5 model detects them poorly. To solve this problem, the paper proposes an improved YOLOv5 model called SEB-YOLO (SPD-Conv + ECSPP + Bi-FPN + YOLOv5). SEB-YOLO modifies YOLOv5 in the following ways:
1. It reconstructs the backbone network with SPD-Conv modules, each a space-to-depth (SPD) layer followed by a non-strided convolution (Conv) layer, to preserve global features and reduce feature loss.
2. It adds a pooling module with an attention mechanism at the final layer of the backbone to better identify and locate targets.
3. It introduces a bidirectional feature pyramid network (Bi-FPN) with bilinear interpolation upsampling to improve bidirectional cross-scale connections and weighted feature fusion.
4. It adopts a decoupled head to improve model convergence and resolve the conflict between the classification and regression tasks.
Experimental results show that SEB-YOLO achieves a mean average precision (mAP) of 93.5% and 93.9% on the NWPU VHR-10 and RSOD datasets, respectively, which is 4.0% and 5.3% higher than the original YOLOv5l algorithm. The algorithm also gives better detection results on complex remote sensing images.
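The space-to-depth step behind SPD-Conv can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `space_to_depth`, the nested-list layout `[channel][row][column]`, and the order of the output channels are assumptions made here for illustration. The point it shows is that downsampling by rearrangement keeps every input value, whereas a strided convolution or pooling layer would skip or merge some of them. The non-strided convolution that follows the SPD layer is omitted.

```python
def space_to_depth(x, scale=2):
    """Rearrange a [C][H][W] nested list into [C*scale*scale][H//scale][W//scale].

    Spatial resolution is traded for channels, so no input value is dropped.
    The output channel order used here is an illustrative assumption.
    """
    channels, height, width = len(x), len(x[0]), len(x[0][0])
    if height % scale or width % scale:
        raise ValueError("height and width must be divisible by scale")
    out = []
    # One output channel per (row offset, column offset, input channel).
    for dy in range(scale):
        for dx in range(scale):
            for c in range(channels):
                out.append([
                    [x[c][i][j] for j in range(dx, width, scale)]
                    for i in range(dy, height, scale)
                ])
    return out


# A single-channel 4x4 feature map holding the values 0..15.
feature_map = [[[r * 4 + c for c in range(4)] for r in range(4)]]

# With scale=2 this becomes four 2x2 sub-maps, one per pixel offset.
sub_maps = space_to_depth(feature_map, scale=2)
for sub_map in sub_maps:
    print(sub_map)
```

In a full SPD-Conv block, the enlarged channel dimension would then be passed through a stride-1 convolution, which reduces the channel count with learned weights instead of discarding pixels.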