SOD-YOLOv8 -- Enhancing YOLOv8 for Small Object Detection in Traffic Scenes

Boshra Khalili, Andrew W. Smyth
2024-08-09
Abstract: Object detection, as part of computer vision, can be crucial for traffic management, emergency response, autonomous vehicles, and smart cities. Despite significant advances in object detection, detecting small objects in images captured by distant cameras remains challenging due to their size, distance from the camera, varied shapes, and cluttered backgrounds. To address these challenges, we propose Small Object Detection YOLOv8 (SOD-YOLOv8), a novel model specifically designed for scenarios involving numerous small objects. Inspired by Efficient Generalized Feature Pyramid Networks (GFPN), we enhance multi-path fusion within YOLOv8 to integrate features across different levels, preserving details from shallower layers and improving small object detection accuracy. In addition, a fourth detection layer is added to leverage high-resolution spatial information effectively. The Efficient Multi-Scale Attention Module (EMA) in the C2f-EMA module enhances feature extraction by redistributing weights and prioritizing relevant features. We introduce Powerful-IoU (PIoU) as a replacement for CIoU, focusing on moderate-quality anchor boxes and adding a penalty based on differences between predicted and ground truth bounding box corners. This approach simplifies calculations, speeds up convergence, and enhances detection accuracy. SOD-YOLOv8 significantly improves small object detection, surpassing widely used models in various metrics, without substantially increasing computational cost or latency compared to YOLOv8s. Specifically, it increases recall from 40.1\% to 43.9\%, precision from 51.2\% to 53.9\%, $\text{mAP}_{0.5}$ from 40.6\% to 45.1\%, and $\text{mAP}_{0.5:0.95}$ from 24\% to 26.6\%. In dynamic real-world traffic scenes, SOD-YOLOv8 demonstrated notable improvements in diverse conditions, proving its reliability and effectiveness in detecting small objects even in challenging environments.
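To put the reported gains in perspective, the short sketch below computes the absolute change (in percentage points) and the relative change for each metric quoted in the abstract. The metric values are taken from the abstract; the dictionary keys and the helper name `metric_gains` are illustrative choices, not names from the paper.

```python
# Metric values (in percent) as quoted in the abstract.
# The key names and the helper below are illustrative, not from the paper.
baseline = {"recall": 40.1, "precision": 51.2, "mAP_0.5": 40.6, "mAP_0.5:0.95": 24.0}
sod_yolov8 = {"recall": 43.9, "precision": 53.9, "mAP_0.5": 45.1, "mAP_0.5:0.95": 26.6}


def metric_gains(before, after):
    """Return {metric: (absolute gain in points, relative gain in percent)}."""
    gains = {}
    for name, old in before.items():
        diff = after[name] - old
        gains[name] = (round(diff, 1), round(100.0 * diff / old, 1))
    return gains


if __name__ == "__main__":
    for name, (points, relative) in metric_gains(baseline, sod_yolov8).items():
        print(f"{name}: +{points} points ({relative}% relative)")
```

Under this calculation, the largest relative gain is in $\text{mAP}_{0.5}$ (4.5 points, about 11.1% relative), while precision shows the smallest (2.7 points, about 5.3% relative).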
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper addresses the challenge of detecting small objects captured by long-distance cameras in traffic scenes. The research team developed a new model, SOD-YOLOv8 (Small Object Detection YOLOv8), aimed at improving detection accuracy for small objects such as pedestrians, vehicles, motorcycles, and bicycles. The main contributions and technical features of SOD-YOLOv8 include:

1. **Multi-path Fusion Enhancement**: Inspired by the Efficient Generalized Feature Pyramid Network (GFPN), the researchers enhanced the multi-path fusion mechanism within YOLOv8 to better integrate features from different levels and retain detailed information from shallow layers, thereby improving the accuracy of small object detection.
2. **Introduction of the Fourth Detection Layer**: To make effective use of high-resolution spatial information, the model adds an extra detection layer, which helps locate small objects more precisely.
3. **C2f-EMA Module**: This module incorporates the Efficient Multi-Scale Attention (EMA) mechanism to redistribute feature weights and prioritize relevant features and spatial details, further improving feature extraction.
4. **PIoU Loss Function**: Used as a replacement for CIoU, Powerful-IoU (PIoU) focuses on moderate-quality anchor boxes and introduces a penalty term based on the differences between the corners of the predicted and ground truth bounding boxes, simplifying the computation, speeding up convergence, and improving detection accuracy.

Experimental results show that SOD-YOLOv8 improves small object detection over YOLOv8s across several metrics: recall increases from 40.1% to 43.9%, precision from 51.2% to 53.9%, $\text{mAP}_{0.5}$ from 40.6% to 45.1%, and $\text{mAP}_{0.5:0.95}$ from 24% to 26.6%.
Additionally, in tests on dynamic real-world traffic scenes, the model showed notable improvements under diverse conditions, supporting its reliability and effectiveness in challenging environments.
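The idea of adding a corner-based penalty to an IoU loss can be illustrated with a minimal sketch. The text above only states that the penalty depends on differences between predicted and ground truth box corners; the specific form used here (mean edge offset normalized by the ground truth width and height, passed through `1 - exp(-P^2)`) is an assumption for illustration and should be checked against the PIoU paper. The sketch also leaves out the focusing on moderate-quality anchor boxes. All function names are hypothetical.

```python
import math

# Boxes are (x1, y1, x2, y2) with x1 < x2 and y1 < y2.


def iou(box_a, box_b):
    """Standard intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def corner_penalty(pred, gt):
    """ASSUMED penalty: mean edge offset, normalized by ground truth size."""
    w_gt = gt[2] - gt[0]
    h_gt = gt[3] - gt[1]
    return (
        abs(pred[0] - gt[0]) / w_gt
        + abs(pred[2] - gt[2]) / w_gt
        + abs(pred[1] - gt[1]) / h_gt
        + abs(pred[3] - gt[3]) / h_gt
    ) / 4.0


def piou_style_loss(pred, gt):
    """IoU loss plus a bounded corner penalty (illustrative form only)."""
    p = corner_penalty(pred, gt)
    return (1.0 - iou(pred, gt)) + (1.0 - math.exp(-p * p))


if __name__ == "__main__":
    gt_box = (0.0, 0.0, 2.0, 2.0)
    print(piou_style_loss((0.0, 0.0, 2.0, 2.0), gt_box))  # perfect match
    print(piou_style_loss((1.0, 0.0, 3.0, 2.0), gt_box))  # shifted right
```

In this form the penalty needs only the four corner coordinates and the ground truth box size, with no enclosing box or aspect-ratio term, which fits the summary's point about simpler computation.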