A multi-scale fusion and dual attention network for crowd counting

De Zhang, Yiting Wang, Xiaoping Zhou, Liangliang Su
DOI: https://doi.org/10.1007/s11042-024-19326-1
IF: 2.577
2024-05-22
Multimedia Tools and Applications
Abstract: Crowd counting has remained an active topic in computer vision and has made significant progress with deep learning techniques. Nevertheless, it is still a challenging task in practical applications because of the scale variation of heads and complex background noise. To overcome these problems, we present a novel multi-scale feature fusion network combined with a dual attention mechanism, named MSDANet, for more accurate crowd counting. In MSDANet, we use the first 10 layers of the VGG-16 model as the backbone, followed by a middle-end module consisting of two parallel branches. In one branch, we stack two multi-scale information fusion (MSIF) blocks to handle scale variations. The MSIF block uses a cross-layer information passing mechanism to achieve sufficient feature fusion and provide rich contextual information. In the other branch, a dual attention (DA) block is designed with a position attention module (PAM) and a channel attention module (CAM) to remove complex background noise in crowd scenes. PAM adaptively assigns importance weights to each point at the pixel level and enhances semantic similarity dependencies among head regions of different sizes, whereas CAM focuses on enhancing global-scope relations and accounting for the interdependence among all channel-wise nodes, which is complementary to PAM. Finally, dilated convolution is introduced to generate a high-quality crowd density map. Extensive experiments on five public datasets demonstrate that the proposed MSDANet achieves state-of-the-art counting performance and high robustness.
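The dual attention branch described in the abstract can be illustrated with a minimal NumPy sketch. It assumes the PAM and CAM follow the common self-attention formulation (pixel-to-pixel and channel-to-channel similarity followed by softmax); the abstract does not give the exact design, so the function names, the absence of learned projections, and the weighted residual sum with `alpha` and `beta` are all hypothetical choices made for illustration only.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def position_attention(feat):
    """Position attention over a (C, H, W) feature map.

    Each pixel is re-expressed as a weighted sum of all pixels,
    weighted by feature similarity (assumed formulation).
    """
    c, h, w = feat.shape
    flat = feat.reshape(c, h * w)        # (C, N), N = H * W
    energy = flat.T @ flat               # (N, N) pixel-to-pixel similarity
    attn = softmax(energy, axis=-1)      # each row sums to 1
    out = flat @ attn.T                  # (C, N) weighted sum over positions
    return out.reshape(c, h, w), attn


def channel_attention(feat):
    """Channel attention over a (C, H, W) feature map.

    Each channel is re-expressed as a weighted sum of all channels,
    weighted by channel similarity (assumed formulation).
    """
    c, h, w = feat.shape
    flat = feat.reshape(c, h * w)        # (C, N)
    energy = flat @ flat.T               # (C, C) channel-to-channel similarity
    attn = softmax(energy, axis=-1)      # each row sums to 1
    out = attn @ flat                    # (C, N) weighted sum over channels
    return out.reshape(c, h, w), attn


def dual_attention(feat, alpha=0.5, beta=0.5):
    # Hypothetical fusion: residual sum of both attention outputs.
    pam_out, _ = position_attention(feat)
    cam_out, _ = channel_attention(feat)
    return feat + alpha * pam_out + beta * cam_out


rng = np.random.default_rng(0)
feat = rng.standard_normal((4, 3, 5))    # toy feature map: 4 channels, 3x5 pixels
fused = dual_attention(feat)
print(fused.shape)                       # (4, 3, 5)
```

In this sketch the position attention map has one row per pixel and the channel attention map has one row per channel, which reflects the complementary roles the abstract assigns to PAM and CAM. A trained network would normally add learned convolutional projections and learned fusion weights, which are omitted here.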
computer science, information systems, theory & methods, engineering, electrical & electronic, software engineering