A study on a target detection model for autonomous driving tasks

Hao Chen, Byung‐Won Min, Haifei Zhang
DOI: https://doi.org/10.1049/ipr2.13185
IF: 2.3
2024-07-15
IET Image Processing
Abstract: The study introduces a novel target detection model tailored for autonomous driving tasks, demonstrating accuracy improvements on the SODA10M and BDD100K datasets while examining the impact of Group Shuffle Convolution on inference speed. By replacing the original large target detection head with a smaller one, the model preserves the details of small targets and incorporates adaptive feature fusion to capture information across multiple scales. Additionally, the study proposes enhanced attention mechanisms and a spatial feature redundancy path to prevent the loss of vital target features, thereby improving the overall performance of the model for autonomous driving applications. Target detection in autonomous driving tasks presents a complex and critical challenge due to the diversity of targets and the intricacy of the environment. To address this issue, this paper proposes an enhanced YOLOv8 model. Firstly, the original large target detection head is removed and replaced with a detection head tailored for small targets and high‐level semantic details. Secondly, an adaptive feature fusion method is proposed, in which input feature maps are processed by dilated convolutions with different dilation rates and the resulting branches are then combined using adaptively generated weights. Finally, an improved attention mechanism is incorporated to enhance the model's focus on target regions. Additionally, the impact of Group Shuffle Convolution (GSConv) on the model's detection speed is investigated. Validated on two public datasets, the model achieves mean Average Precision (mAP) values of 53.7% and 53.5%. Although introducing GSConv results in a slight decrease in mAP, it significantly improves inference speed in frames per second (FPS). These findings underscore the effectiveness of the proposed model in autonomous driving tasks.
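The adaptive feature fusion step named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it works on 1-D signals in pure Python rather than 2-D feature maps, and the function names, the dilation rates (1, 2, 3), and the weighting rule (softmax over each branch's mean absolute response) are all assumptions chosen only to show the general idea of fusing dilated-convolution branches with input-dependent weights.

```python
import math


def dilated_conv1d(signal, kernel, dilation):
    """Zero-padded, same-length 1-D dilated convolution (cross-correlation form)."""
    half = (len(kernel) // 2) * dilation  # offset that centers the kernel on position i
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i - half + j * dilation  # kernel taps are spaced `dilation` apart
            if 0 <= idx < len(signal):     # taps outside the signal count as zero
                acc += w * signal[idx]
        out.append(acc)
    return out


def softmax(scores):
    """Numerically stable softmax over a list of scalars."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def adaptive_fusion(signal, kernel, dilations=(1, 2, 3)):
    """Fuse branches with different dilation rates using input-dependent weights.

    Assumption: each branch is scored by its mean absolute response and the
    scores are normalized with softmax. The paper may compute weights differently.
    """
    branches = [dilated_conv1d(signal, kernel, d) for d in dilations]
    scores = [sum(abs(v) for v in b) / len(b) for b in branches]
    weights = softmax(scores)
    fused = [
        sum(w * b[i] for w, b in zip(weights, branches))
        for i in range(len(signal))
    ]
    return fused, weights


if __name__ == "__main__":
    fused, weights = adaptive_fusion([0, 0, 0, 1, 0, 0, 0], [1, 1, 1])
    print("weights:", weights)
    print("fused:  ", fused)
```

Larger dilation rates widen the receptive field without adding kernel parameters, which is why branches with different rates respond to structure at different scales. The weights here depend on the input, so a branch that responds more strongly contributes more to the fused output.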
Computer Science, Artificial Intelligence; Engineering, Electrical & Electronic; Imaging Science & Photographic Technology