SE-YOLOv4: shuffle expansion YOLOv4 for pedestrian detection based on PixelShuffle

Mingsheng Liu, Liang Wan, Bo Wang, Tingting Wang
DOI: https://doi.org/10.1007/s10489-023-04456-0
IF: 5.3
2023-01-25
Applied Intelligence
Abstract: In pedestrian detection, the upsampling operation of YOLOv4 during feature aggregation affects the integrity of feature information for small-scale and occluded targets. To address this issue, we propose a pedestrian detection model named Shuffle Expansion YOLOv4 (SE-YOLOv4) composed of a path aggregation network based on PixelShuffle (Shuffle-PANet) and an efficient pyramid atrous convolutional block attention module (EPA-CBAM), to improve the detection performance of small-scale and occluded pedestrian targets. First, we propose a feature aggregation network Shuffle-PANet based on PixelShuffle to maintain the feature information integrity of small-scale and occluded targets by expanding high-resolution feature maps through convolutions and interchannel periodic shuffling instead of linear interpolation-based upsampling. Then, we propose EPA-CBAM, whose channel attention module (EPA-CAM) can build a pyramid structure and obtain fine-grained multiscale spatial information in different channels by dilated convolutions of corresponding sizes. The results show that the miss rate of SE-YOLOv4 decreased by 3.54% compared with that of the vanilla YOLOv4 on the CityPersons dataset. Comparison experiment results on four challenging pedestrian detection datasets show that our method achieves very competitive performance and maintains a reasonable balance between accuracy and speed.
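The "interchannel periodic shuffling" mentioned in the abstract can be illustrated with a minimal NumPy sketch of the generic PixelShuffle rearrangement. This is not the authors' Shuffle-PANet: the function name `pixel_shuffle`, the channel-first layout, and the upscale factor `r` are assumptions made for illustration, and the convolution that would normally produce the `C * r * r` input channels is omitted.

```python
import numpy as np


def pixel_shuffle(x: np.ndarray, r: int) -> np.ndarray:
    """Rearrange a (C * r * r, H, W) array into (C, H * r, W * r).

    Hypothetical helper for illustration: each group of r * r input
    channels supplies the r x r block of output pixels at every spatial
    location, so no values are interpolated or discarded.
    """
    c_in, h, w = x.shape
    if c_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r * r")
    c_out = c_in // (r * r)

    # Split the channel axis into (output channel, row offset, column offset).
    x = x.reshape(c_out, r, r, h, w)
    # Interleave the offsets with the spatial axes: (c_out, h, r, w, r).
    x = x.transpose(0, 3, 1, 4, 2)
    # Merge (h, r) and (w, r) into the enlarged spatial axes.
    return x.reshape(c_out, h * r, w * r)


# Four 2x2 channels become one 4x4 channel when r = 2.
features = np.arange(4 * 2 * 2).reshape(4, 2, 2)
upsampled = pixel_shuffle(features, 2)
print(upsampled.shape)
```

Unlike interpolation-based upsampling, which computes new pixels as weighted averages of neighbors, this rearrangement only moves existing values, which is the property the abstract relies on to keep the feature information of small-scale and occluded targets intact.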
computer science, artificial intelligence