Improved YOLOv3 model with feature map cropping for multi-scale road object detection

Lingzhi Shen, Hongfeng Tao, Yuanzhi Ni, Yue Wang, Stojanovic Vladimir
DOI: https://doi.org/10.1088/1361-6501/acb075
IF: 2.398
2023-01-07
Measurement Science and Technology
Abstract: Road object detection is an essential step in the driving of intelligent vehicles. Road objects such as vehicles and pedestrians typically appear at multiple scales and with uncertain spatial distribution, which places high demands on the detection algorithm. This paper therefore proposes a method based on YOLOv3 (You Only Look Once v3) that aims to enhance cross-scale detection and to focus on the valuable areas of the image. The proposed method addresses the need for multi-scale detection, and its individual components are also useful for road object detection in their own right. First, a K-means-giou algorithm is designed to generate a priori boxes whose shapes are close to those of the real boxes. This greatly reduces the complexity of training and paves the way for fast convergence. Then, a detection branch is added to detect small targets, and a feature map cropping module is introduced into this new branch to remove areas that have a high probability of containing background or easy-to-detect targets; the cropped areas of the feature map are filled with the value 0. In addition, a channel attention module and a spatial attention module are added to strengthen the network's attention to major regions. Experimental results on the KITTI dataset show that the proposed method maintains a fast detection speed and increases the mAP (mean Average Precision) by as much as 2.86% compared with YOLOv3-ultra, with a particular improvement in detection performance for small-scale objects.
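The abstract does not give the details of the K-means-giou algorithm, so the following is only a minimal sketch of one plausible reading: K-means clustering of ground-truth box shapes (width, height) in which the distance to a cluster centre is 1 - GIoU rather than Euclidean distance. The function names `giou_wh` and `kmeans_giou`, the choice to align all boxes at a common corner, and the mean-based centroid update are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def giou_wh(boxes, centroids):
    """GIoU between (N, 2) boxes and (K, 2) centroids given as (width, height).

    Assumption: every box shares the same top-left corner, so only shape matters.
    Returns an (N, K) array with values in (-1, 1].
    """
    w, h = boxes[:, None, 0], boxes[:, None, 1]
    cw, ch = centroids[None, :, 0], centroids[None, :, 1]
    inter = np.minimum(w, cw) * np.minimum(h, ch)
    union = w * h + cw * ch - inter
    # Smallest box enclosing both shapes when they share a corner.
    enclose = np.maximum(w, cw) * np.maximum(h, ch)
    return inter / union - (enclose - union) / enclose


def kmeans_giou(boxes, k, iters=100, seed=0):
    """Cluster box shapes into k prior boxes using 1 - GIoU as the distance."""
    boxes = np.asarray(boxes, dtype=float)
    rng = np.random.default_rng(seed)
    centroids = boxes[rng.choice(len(boxes), size=k, replace=False)]
    for _ in range(iters):
        # Assign each box to the centroid with the smallest 1 - GIoU distance.
        assign = np.argmin(1.0 - giou_wh(boxes, centroids), axis=1)
        # Update each centroid to the mean shape of its members;
        # an empty cluster keeps its previous centroid.
        new = np.array([
            boxes[assign == j].mean(axis=0) if np.any(assign == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new, centroids):
            break
        centroids = new
    # Return the priors sorted by area, smallest first.
    return centroids[np.argsort(centroids.prod(axis=1))]


if __name__ == "__main__":
    # Hypothetical box shapes in pixels, not KITTI data.
    shapes = [[10, 10]] * 5 + [[100, 100]] * 5
    print(kmeans_giou(shapes, k=2))
```

In practice the input would be the (width, height) pairs of all labelled boxes in the training set, and `k` would match the number of prior boxes the detector uses across its detection branches.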
engineering, multidisciplinary; instruments & instrumentation