Improvement and Enhancement of YOLOv5 Small Target Recognition Based on Multi-module Optimization

Qingyang Li, Yuchen Li, Hongyi Duan, JiaLiang Kang, Jianan Zhang, Xueqian Gan, Ruotong Xu
2023-10-03
Abstract: This paper studies the limitations of the YOLOv5s model on small target detection and proposes improvements to address them. Model performance is enhanced by introducing a GhostNet-based convolutional module, a RepGFPN-based Neck module, coordinate attention (CA) and Transformer attention mechanisms, and an NWD-based loss function. The experimental results validate the positive impact of these strategies on model precision, recall, and mAP. In real-world application tests, the improved model performs notably better than the baseline on complex backgrounds and tiny targets. This study provides an effective optimization strategy for the YOLOv5s model on small target detection and lays a foundation for future related research and applications.
Computer Vision and Pattern Recognition, Artificial Intelligence
What problem does this paper attempt to address?
This paper primarily addresses the limitations of the YOLOv5s model in small object detection tasks and proposes a series of improvements to enhance model performance.

### Research Background and Problem Definition

The paper points out that small object recognition is playing an increasingly important role in industrial production and daily life. For example, in semiconductor manufacturing, early detection of micron or even nanometer-level defects can significantly reduce production costs; in medical image analysis, cell detection is a common application; in the field of security surveillance, tracking small drones is also one of the important application scenarios. Although YOLOv5 has achieved excellent results in object detection, it still has shortcomings in small object detection.

### Solution Overview

To overcome the limitations of the YOLOv5s model in small object detection, the paper proposes a small object recognition method based on multi-module optimization. Specific improvements include:

1. **Convolution Module Improvement**: Introduce GhostNet to enhance the model's feature extraction capability, using its method of generating "ghost" features to reduce computational burden while maintaining high performance.
2. **Neck Module Optimization**: Use RepGFPN to optimize the network structure, better integrating multi-scale features.
3. **Attention Mechanism Enhancement**: Combine coordinate attention (CA) and Transformer attention mechanisms to make the model more focused on key areas.
4. **Loss Function Improvement**: Use normalized Gaussian Wasserstein distance (NWD) as the loss function to better distinguish small objects from the background.

### Experimental Results and Conclusion

Experimental results show that the improved YOLOv5s model has significantly improved detection accuracy, recall rate, and average precision for small objects across multiple datasets. Especially in practical application scenarios with complex backgrounds and extremely small objects, the improved model demonstrates clear advantages.

In summary, this study significantly enhances the performance of the YOLOv5s model in small object detection tasks through a series of targeted improvement strategies, laying a solid foundation for further related research and applications.
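
To make the NWD-based loss mentioned above more concrete, the following is a minimal sketch of the normalized Gaussian Wasserstein distance as it is commonly defined for tiny object detection. It assumes boxes are given as `(cx, cy, w, h)` and that each box is modeled as a 2D Gaussian with mean `(cx, cy)` and covariance `diag(w²/4, h²/4)`. The function names `nwd` and `nwd_loss`, the normalizing constant `c` and its default value, and the `1 - NWD` form of the loss are illustrative assumptions; this summary does not give the exact formulation used in the paper.

```python
import math


def nwd(box_a, box_b, c=12.8):
    """Normalized Gaussian Wasserstein distance between two boxes.

    Boxes are (cx, cy, w, h). The constant c is a dataset-dependent
    normalizer; 12.8 is only an illustrative default (assumption).
    """
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    # Squared 2-Wasserstein distance between the two box Gaussians:
    # squared distance between centers plus squared distance between
    # half-widths and half-heights.
    w2_squared = (
        (cxa - cxb) ** 2
        + (cya - cyb) ** 2
        + ((wa - wb) / 2.0) ** 2
        + ((ha - hb) / 2.0) ** 2
    )
    # Exponential normalization maps the distance into (0, 1].
    return math.exp(-math.sqrt(w2_squared) / c)


def nwd_loss(pred_box, target_box, c=12.8):
    """Loss form assumed here: 1 - NWD, which is 0 for identical boxes."""
    return 1.0 - nwd(pred_box, target_box, c)


if __name__ == "__main__":
    pred = (10.0, 10.0, 4.0, 4.0)
    near = (11.0, 10.0, 4.0, 4.0)   # shifted by one pixel
    far = (30.0, 10.0, 4.0, 4.0)    # no overlap with pred at all
    print("loss (near):", nwd_loss(pred, near))
    print("loss (far): ", nwd_loss(pred, far))
```

Unlike IoU, which is exactly zero for any pair of non-overlapping boxes, this similarity stays positive and decreases smoothly with distance, which is one reason it is attractive for very small objects where a shift of a few pixels can eliminate all overlap.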