Small Target-YOLOv5: Enhancing the Algorithm for Small Object Detection in Drone Aerial Imagery Based on YOLOv5

Jiachen Zhou, Taoyong Su, Kewei Li, Jiyang Dai
DOI: https://doi.org/10.3390/s24010134
IF: 3.9
2023-12-26
Sensors
Abstract: Object detection in drone aerial imagery has been a consistent focal point of research. Aerial images present more intricate backgrounds, greater variation in object scale, and a higher occurrence of small objects compared to standard images. Consequently, conventional object detection algorithms are often unsuitable for direct application in drone scenarios. To address these challenges, this study proposes a drone object detection algorithm model based on YOLOv5, named SMT-YOLOv5 (Small Target-YOLOv5). The enhancement strategy involves improving the feature fusion network by incorporating detection layers and implementing a weighted bidirectional feature pyramid network. Additionally, the introduction of the Combine Attention and Receptive Fields Block (CARFB) receptive field feature extraction module and DyHead dynamic target detection head aims to broaden the receptive field, mitigate information loss, and enhance perceptual capabilities in spatial, scale, and task domains. Experimental validation on the VisDrone2021 dataset confirms a significant improvement in the target detection accuracy of SMT-YOLOv5. Each improvement strategy yields effective results, raising the average precision by 12.4 percentage points compared to the original method. Detection improvements for large, medium, and small targets increase by 6.9%, 9.5%, and 7.7%, respectively, compared to the original method. Similarly, applying the same improvement strategies to the low-complexity YOLOv8n results in SMT-YOLOv8n, which is comparable in complexity to SMT-YOLOv5s. The results indicate that, relative to SMT-YOLOv8n, SMT-YOLOv5s achieves a 2.5 percentage point increase in average precision. Furthermore, comparative experiments with other enhancement methods demonstrate the effectiveness of the improvement strategies.
Engineering, Electrical & Electronic; Chemistry, Analytical; Instruments & Instrumentation
What problem does this paper attempt to address?
This paper addresses the challenges of small-object detection in UAV aerial images. Compared with standard images, UAV aerial images have more complex backgrounds, greater variation in object scale, and a higher proportion of small objects, so conventional object-detection algorithms are usually unsuitable for direct use in UAV scenarios. To meet these challenges, the paper proposes a UAV object-detection model based on YOLOv5, named SMT-YOLOv5 (Small Target-YOLOv5).

### Main problems and solutions

1. **Inaccurate small-target localization**:
   - Small targets occupy only a small fraction of the image and are difficult to locate precisely.
   - **Solution**: A small-target detection layer with 4× downsampling is introduced, which significantly enhances the detection ability for small targets. Combined with a multi-level feature pyramid structure, it integrates local and global information and improves detection accuracy for targets at different scales.
2. **Loss of small-target feature information**:
   - Repeated downsampling easily discards the key features of small targets, and these details are hard to recover afterwards.
   - **Solution**: A receptive-field feature extraction module combined with an attention mechanism (CARFB) is introduced. Its spatial and channel attention mechanisms strengthen the model's ability to represent key information, so the features of small targets are captured more effectively.
3. **Small-target class confusion**:
   - Small targets are easily occluded and may resemble other object classes in the surrounding environment, leading to confusion and misclassification.
   - **Solution**: A dynamic detection head (DyHead) is introduced. It integrates several self-attention mechanisms to strengthen perception across the spatial, scale, and task dimensions, improving detection accuracy for small targets.

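
To see why a detection layer with 4× downsampling helps, it is useful to compare grid resolutions across detection levels. The sketch below is illustrative arithmetic only: it assumes a 640×640 input and a hypothetical 12-pixel-wide object, neither of which is stated in the summary. The stride 8/16/32 levels are the default YOLOv5 detection scales, and the stride-4 level stands in for the added small-target layer.

```python
# Illustrative arithmetic only: image size and object width are assumptions,
# not values taken from the paper.
IMAGE_SIZE = 640      # assumed square input resolution in pixels
OBJECT_PIXELS = 12    # hypothetical width of a small object in pixels

# Downsampling factor (stride) of each detection level.
# P3-P5 are the default YOLOv5 heads; P2 is the added 4x small-target layer.
STRIDES = {"P2": 4, "P3": 8, "P4": 16, "P5": 32}


def grid_cells(image_size: int, stride: int) -> int:
    """Number of grid cells along one side of the feature map."""
    return image_size // stride


def object_extent_in_cells(object_pixels: float, stride: int) -> float:
    """How many grid cells an object of the given pixel width spans."""
    return object_pixels / stride


for level, stride in STRIDES.items():
    side = grid_cells(IMAGE_SIZE, stride)
    extent = object_extent_in_cells(OBJECT_PIXELS, stride)
    print(f"{level}: stride {stride:>2}, grid {side}x{side}, "
          f"{OBJECT_PIXELS} px object spans {extent:.2f} cells")
```

Under these assumptions the 12-pixel object covers less than one cell at strides 16 and 32 but three cells at stride 4, which is the intuition behind adding the shallow detection layer.
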
### Specific content of algorithm improvement

1. **Feature-fusion network architecture**:
   - A weighted bidirectional feature pyramid network (BiFPN) is introduced. Through bidirectional cross-scale connections and fast normalization, features at different levels are integrated effectively, preventing the loss of spatial-position information in shallow-layer feature maps.
   - A detection layer dedicated to extremely small targets is added on the P2 branch, exploiting the rich shape, position, and size information in shallow convolutional layers to improve small-target detection.
2. **Improved feature-fusion path**:
   - A skip-connection structure is adopted. During intermediate feature fusion it preserves the effective fusion of features with different resolutions, so the final feature map retains rich spatial and semantic information.
   - A fast-normalization feature-fusion method is used. Input features of different resolutions are combined with learnable contribution weights, so the network learns the importance of each input feature.
3. **Receptive-field feature-extraction module based on an attention mechanism**:
   - By combining channel and spatial attention mechanisms, the module strengthens feature expression across channels with different receptive fields, improving the model's detection ability for multi-scale and dense small targets.

### Experimental verification

Compared with the original method, SMT-YOLOv5 raises average precision on the VisDrone2021 dataset by 12.4 percentage points, and detection of large, medium, and small targets improves by 6.9%, 9.5%, and 7.7%, respectively. Applying the same improvement strategies to the low-complexity YOLOv8n yields SMT-YOLOv8n, which is comparable in complexity to SMT-YOLOv5s; SMT-YOLOv5s achieves an average precision 2.5 percentage points higher than SMT-YOLOv8n.

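
The fast-normalization fusion mentioned in the feature-fusion items above can be sketched in a few lines. This is a minimal illustration of the general BiFPN-style rule, in which each input is scaled by a non-negative weight and the sum is divided by the total weight plus a small constant. It is not the paper's implementation. The function name, the flat lists standing in for feature maps already resized to a common resolution, and the epsilon value are all assumptions made for the example.

```python
from typing import List, Sequence

EPSILON = 1e-4  # assumed small constant that keeps the division stable


def fast_normalized_fusion(
    features: Sequence[Sequence[float]],
    raw_weights: Sequence[float],
    eps: float = EPSILON,
) -> List[float]:
    """Fuse same-sized feature vectors with normalized non-negative weights.

    Hypothetical helper: each "feature" is a flat list standing in for a
    feature map that has already been resized to a common resolution.
    """
    if not features or len(features) != len(raw_weights):
        raise ValueError("need one weight per input feature")
    length = len(features[0])
    if any(len(f) != length for f in features):
        raise ValueError("all features must have the same size")

    # ReLU keeps every weight non-negative, so no softmax is needed.
    weights = [max(0.0, w) for w in raw_weights]
    total = eps + sum(weights)

    # Weighted sum of the inputs, divided by the total weight.
    return [
        sum(w * f[i] for w, f in zip(weights, features)) / total
        for i in range(length)
    ]


# Example: a shallow feature weighted three times as heavily as a deep one.
shallow = [1.0, 2.0, 3.0]
deep = [3.0, 2.0, 1.0]
fused = fast_normalized_fusion([shallow, deep], [3.0, 1.0])
print(fused)
```

In a trained network the raw weights would be learnable parameters, so the normalized weights express how much each resolution contributes to the fused map.
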
In conclusion, by redesigning the network architecture, integrating multi-scale features, and introducing attention mechanisms, this paper addresses small-target detection in UAV aerial images and reports a substantial improvement in detection performance on the VisDrone2021 dataset.