MGA-YOLOv4: a multi-scale pedestrian detection method based on mask-guided attention

Tingting Wang, Liang Wan, Lu Tang, Mingsheng Liu
DOI: https://doi.org/10.1007/s10489-021-03061-3
IF: 5.3
2022-03-14
Applied Intelligence
Abstract: YOLOv4 stacks many deep convolutions, which generate redundant background features and prevent the detector from focusing on pedestrians at a specific scale. To address this, we propose MGA-YOLOv4 (Mask-Guided Attention YOLOv4), which dynamically selects the most crucial features from a cluttered background. First, we design a semantic segmentation encoder-decoder network that generates a fine-grained pixel-level mask, which serves as a weakly supervised signal in each detection branch. Second, we build a mask-guided attention module that produces attention weights along the channel and spatial dimensions and encodes them into the mask, highlighting pedestrians of a specific scale and suppressing background interference. To demonstrate the effectiveness of MGA, we visualize the network attention maps and conduct ablation experiments. The results show that the variant combining channel and spatial attention by concatenation reduces the miss rate by 1.82% compared with the original YOLOv4. Comparative experiments on five challenging pedestrian detection datasets show that our method is highly competitive with state-of-the-art methods and achieves a favourable trade-off between speed and accuracy.
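The mask-guided attention idea in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names (channel_attention, spatial_attention, mask_guided_attention) are hypothetical, the attention weights here come from parameter-free pooling rather than learned layers, and the channel and spatial weights are combined by simple multiplication instead of the concatenation variant the abstract reports. It only shows how a pixel-level mask can gate a feature map so that background positions are suppressed.

```python
import numpy as np


def sigmoid(x):
    """Squash values into (0, 1) so they can act as attention weights."""
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(feat):
    """Per-channel weight from global average pooling (assumed, parameter-free).

    feat has shape (C, H, W); the result has shape (C, 1, 1).
    """
    pooled = feat.mean(axis=(1, 2))
    return sigmoid(pooled)[:, None, None]


def spatial_attention(feat):
    """Per-pixel weight from averaging across channels (assumed, parameter-free).

    feat has shape (C, H, W); the result has shape (1, H, W).
    """
    pooled = feat.mean(axis=0, keepdims=True)
    return sigmoid(pooled)


def mask_guided_attention(feat, mask):
    """Gate a feature map with channel weights, spatial weights, and a mask.

    mask has shape (H, W) with values in [0, 1], standing in for the
    pedestrian mask predicted by a segmentation branch.
    """
    attn = channel_attention(feat) * spatial_attention(feat) * mask[None, :, :]
    return feat * attn
```

With a mask that is zero outside a pedestrian region, every feature outside that region is driven to zero, while features inside it are rescaled by the product of the channel and spatial weights.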
computer science, artificial intelligence