DCGSA: A global self-attention network with dilated convolution for crowd density map generating

Liping Zhu, Chengyang Li, Bing Wang, Kun Yuan, Zhongguo Yang
DOI: https://doi.org/10.1016/j.neucom.2019.10.081
IF: 6
2020-01-01
Neurocomputing
Abstract: Due to non-uniform density and variations in scale and perspective, estimating the crowd count in scenes with varying degrees of crowding is an extremely challenging task. Most deep learning models use pooling operations, so the density map at the original resolution is obtained only through a final upsampling step. This paper aims to solve the problem of local spatial information being lost to pooling in density map estimation. To this end, we propose a dilated convolutional neural network with global self-attention, named DCGSA. Specifically, we introduce a Global Self-Attention module (GSA), which provides global context as guidance for low-level features to select person location details, and a Pyramid Dilated Convolution module (PDC), which extracts channel-wise and pixel-wise features more precisely. Extensive experiments on several crowd datasets show that our method achieves lower crowd counting error and better density maps than recent state-of-the-art methods. In particular, our method also performs well on the sparse dataset UCSD. (C) 2019 Elsevier B.V. All rights reserved.
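The resolution trade-off behind the abstract's motivation can be illustrated with a minimal sketch in plain Python. This is not the DCGSA implementation and it does not reproduce the GSA or PDC modules; it is a 1D toy, and the function names `dilated_conv1d`, `max_pool1d`, and `receptive_field` are hypothetical names chosen for this example. It only shows the general property that a zero-padded dilated convolution keeps the input resolution while widening the receptive field, whereas pooling shrinks the output.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Zero-padded ('same') 1D dilated convolution, assuming an odd kernel size."""
    k = len(kernel)
    # Span covered by the kernel once (dilation - 1) gaps sit between its taps.
    span = dilation * (k - 1) + 1
    pad = span // 2
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    # One output value per input position, so resolution is preserved.
    return [
        sum(kernel[j] * padded[i + j * dilation] for j in range(k))
        for i in range(len(signal))
    ]


def max_pool1d(signal, size=2):
    """Non-overlapping max pooling; the output is shorter than the input."""
    return [max(signal[i:i + size]) for i in range(0, len(signal) - size + 1, size)]


def receptive_field(kernel_size, dilation):
    """Number of input positions spanned by one dilated kernel application."""
    return dilation * (kernel_size - 1) + 1


# Toy "density" signal with a single person at index 3 (illustrative data only).
signal = [0, 0, 0, 1, 0, 0, 0, 0]

same_res = dilated_conv1d(signal, [1, 1, 1], dilation=2)
pooled = max_pool1d(signal, size=2)

print(len(signal), len(same_res), len(pooled))
print(receptive_field(3, 1), receptive_field(3, 2), receptive_field(3, 4))
```

With the same 3-tap kernel, raising the dilation rate enlarges the span of input positions each output sees without adding parameters or reducing the output length, which is the general reason dilated convolutions are used as an alternative to pooling followed by upsampling.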