Cascade Transformer Decoder Based Occluded Pedestrian Detection With Dynamic Deformable Convolution and Gaussian Projection Channel Attention Mechanism
Chunjie Ma, Li Zhuo, Jiafeng Li, Yutong Zhang, Jing Zhang
DOI: https://doi.org/10.1109/tmm.2023.3251100
IF: 7.3
2023-05-09
IEEE Transactions on Multimedia
Abstract: Occluded pedestrian detection is a challenging computer vision problem because pedestrians are frequently occluded by obstacles or by other people, especially in crowded scenes. This article proposes an occluded pedestrian detection method built on the basic DEtection TRansformer (DETR) framework. First, a Dynamic Deformable Convolution (DyDC) and a Gaussian Projection Channel Attention (GPCA) mechanism are proposed and embedded into the low and high layers of ResNet50, respectively, to improve the representation capability of the features. Second, a Cascade Transformer Decoder (CTD) is proposed to generate high-score queries, which avoids the influence of low-score queries in the decoder stage and further improves detection accuracy. The method is evaluated on three challenging datasets: CrowdHuman, WiderPerson, and TJU-DHD-pedestrian. The experimental results show that it achieves superior detection performance compared with state-of-the-art methods.
computer science, information systems, telecommunications, software engineering