Vehicle–Pedestrian Detection Method Based on Improved YOLOv8

Bo Wang, Yuan-Yuan Li, Weijie Xu, Huawei Wang, Li Hu
DOI: https://doi.org/10.3390/electronics13112149
IF: 2.9
2024-05-31
Electronics
Abstract: The YOLO series of target detection networks is widely used for transportation targets because of its high detection accuracy and good real-time performance. However, it also has limitations: poor detection in scenes with large variations in target scale, heavy consumption of computational resources, and a large storage footprint. To address these issues, this study uses the YOLOv8n model as the baseline and makes the following four improvements: (1) embedding the BiFormer attention mechanism in the Neck layer to capture the associations and dependencies between features more efficiently; (2) adding a 160 × 160 small-scale target detection head in the Head layer of the network to enhance pedestrian and motorcycle detection; (3) adopting a weighted bidirectional feature pyramid structure to enhance the feature fusion capability of the network; and (4) using WIoUv3 as the loss function to enhance the focus on common-quality anchor boxes. With these improvements, the evaluation metrics of the model improve significantly. Compared with the original YOLOv8n, the mAP reaches 95.9%, an increase of 4.7 percentage points, and the mAP50:95 reaches 74.5%, an improvement of 6.2 percentage points.
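As a quick sanity check on the figures quoted in the abstract, the sketch below recovers the YOLOv8n baseline scores implied by the reported gains and the downsampling stride of the added 160 × 160 head. The 640 × 640 input size is an assumption (it is the usual YOLOv8 default and is not stated in the abstract), and both function names are illustrative only.

```python
def baseline_from_gain(improved_pct: float, gain_points: float) -> float:
    """Return the baseline score implied by an improved score and its gain
    in percentage points, rounded to one decimal place."""
    return round(improved_pct - gain_points, 1)


def head_stride(input_size: int, grid_size: int) -> int:
    """Return the downsampling stride of a detection head with a square grid."""
    if input_size % grid_size != 0:
        raise ValueError("input size must be a multiple of the grid size")
    return input_size // grid_size


# Figures quoted in the abstract.
baseline_map = baseline_from_gain(95.9, 4.7)      # implied YOLOv8n mAP
baseline_map5095 = baseline_from_gain(74.5, 6.2)  # implied YOLOv8n mAP50:95

# ASSUMPTION: a 640 x 640 network input, which the abstract does not state.
stride = head_stride(640, 160)

print(baseline_map, baseline_map5095, stride)
```

Under that assumption the new head operates at stride 4, one level finer than the stride-8, 16, and 32 heads of the stock YOLOv8 detector, which is consistent with its stated purpose of helping with small targets such as pedestrians and motorcycles.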
Engineering, Electrical & Electronic; Computer Science, Information Systems; Physics, Applied
What problem does this paper attempt to address?
This paper addresses vehicle and pedestrian detection in complex traffic scenes, which is difficult when targets vary widely in scale, overlap one another, or are occluded by the background. Existing target detection algorithms perform poorly under these conditions. Although the YOLO series of detection networks offers high accuracy and good real-time performance, it handles large scale variations poorly, consumes substantial computing resources, and occupies considerable storage space. The research therefore aims to improve vehicle and pedestrian detection, especially for small targets, by modifying the YOLOv8 model.

To achieve this goal, the researchers propose four improvements:

1. **Embed BiFormer attention mechanism**: Embed the BiFormer attention mechanism in the Neck layer to capture the associations and dependencies between features more effectively.
2. **Add small-scale target detection head**: Add a 160 × 160 small-scale target detection head in the Head layer to improve the detection of pedestrians and motorcycles.
3. **Adopt weighted bidirectional feature pyramid structure**: Introduce a weighted bidirectional feature pyramid structure to strengthen the network's feature fusion.
4. **Use WIoUv3 as loss function**: Use WIoUv3 as the loss function to increase the focus on common-quality anchor boxes.

With these improvements, the model's evaluation metrics improve significantly. Compared with the original YOLOv8n, mAP reaches 95.9%, an increase of 4.7 percentage points, and mAP50:95 reaches 74.5%, an increase of 6.2 percentage points. Beyond higher detection accuracy, the improvements also make the model more robust in complex traffic environments.
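To make the "weighted bidirectional feature pyramid" idea more concrete, here is a minimal sketch of weighted feature fusion. It assumes the structure follows the BiFPN-style fast normalized fusion known from EfficientDet, where each input feature gets a learnable non-negative weight and the weights are normalized by their sum. The summary does not give the paper's exact formulation, so the function name, the `eps` value, and the toy inputs are all illustrative assumptions.

```python
from typing import List, Sequence


def fast_normalized_fusion(
    features: Sequence[Sequence[float]],
    raw_weights: Sequence[float],
    eps: float = 1e-4,
) -> List[float]:
    """Fuse same-length feature vectors using non-negative, normalized weights.

    Sketch of BiFPN-style fusion: out = sum_i (w_i / (eps + sum_j w_j)) * x_i,
    with each w_i clamped to be non-negative.
    """
    if not features:
        raise ValueError("need at least one feature")
    if len(features) != len(raw_weights):
        raise ValueError("need exactly one weight per input feature")
    length = len(features[0])
    if any(len(f) != length for f in features):
        raise ValueError("all features must have the same length")

    # Clamp at zero (ReLU) so every weight is non-negative.
    weights = [max(w, 0.0) for w in raw_weights]
    total = sum(weights) + eps  # eps guards against division by zero

    fused = [0.0] * length
    for w, feat in zip(weights, features):
        scale = w / total
        for i, value in enumerate(feat):
            fused[i] += scale * value
    return fused


# Toy example: two "feature maps" flattened to vectors (hypothetical values).
shallow = [1.0, 2.0, 3.0]  # e.g. a fine-resolution feature
deep = [3.0, 2.0, 1.0]     # e.g. an upsampled coarse-resolution feature
fused = fast_normalized_fusion([shallow, deep], raw_weights=[3.0, 1.0])
print(fused)
```

In a real network the weights would be trained parameters and the inputs would be multi-channel tensors resized to a common resolution. The point of the sketch is only that the fused output leans toward whichever input carries the larger learned weight, rather than treating all scales equally.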