LEVERAGE MULTI-SCALE DILATED CONVOLUTIONAL NEURAL NETWORK WITH GLOBAL ATTENTION FEATURE FUSION FOR CROWD COUNTING

Meilei Lv, Kuncai Zhang, Xiaoyun Zheng, W. E. Yang, Zhe-Ming Lu
DOI: https://doi.org/10.24507/ijicic.18.04.1147
2022-01-01
Abstract: Crowd counting in complex scenes is a challenging problem that has attracted much attention in both academia and industry because of its applications in public safety. Recent state-of-the-art methods for counting people in crowded scenes rely on deep Convolutional Neural Networks (CNNs) to estimate crowd density. In this paper, we propose the Multi-Scale Dilated Convolutional Neural Network (MSDNet), a novel network for accurate and efficient crowd density estimation. MSDNet uses multi-scale dilated kernels to aggregate features across different scales. In addition, a Global Attention Fusion Module (GAFM) is designed to merge features from different levels through a global attention mechanism. Extensive experiments on four benchmark crowd counting datasets (ShanghaiTech, UCF_CC_50, WorldExpo'10, and UCSD) demonstrate the superior performance of the proposed method over other competitive approaches.
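The idea of multi-scale dilated kernels can be illustrated with a minimal sketch. It is not the MSDNet implementation: the function names, the 3x3 kernel, and the dilation rates (1, 2, 3) are assumptions chosen for illustration, and the GAFM fusion step is not modeled. The sketch only shows that the same kernel, applied at several dilation rates, covers progressively larger receptive fields without adding weights.

```python
def receptive_field(kernel_size, dilation):
    """Side length of the area covered by a dilated square kernel."""
    return dilation * (kernel_size - 1) + 1


def dilated_conv2d(image, kernel, dilation):
    """Same-size 2D cross-correlation with zero padding and a dilated kernel.

    Assumes a square, odd-sized kernel given as a list of lists.
    """
    h, w = len(image), len(image[0])
    k = len(kernel)
    half = k // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    # Kernel taps are spaced `dilation` pixels apart.
                    yy = y + (i - half) * dilation
                    xx = x + (j - half) * dilation
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += kernel[i][j] * image[yy][xx]
            out[y][x] = acc
    return out


def multi_scale_features(image, kernel, dilations=(1, 2, 3)):
    """One feature map per dilation rate (rates are illustrative assumptions)."""
    return [dilated_conv2d(image, kernel, d) for d in dilations]


# A 7x7 image with a single active pixel at the center.
image = [[0.0] * 7 for _ in range(7)]
image[3][3] = 1.0
kernel = [[1.0] * 3 for _ in range(3)]

features = multi_scale_features(image, kernel)
for d, fmap in zip((1, 2, 3), features):
    print(d, receptive_field(3, d), sum(map(sum, fmap)))
```

Each feature map responds to the central pixel at a different spatial spread, which is the property a multi-scale design relies on when heads in a crowd image appear at different sizes.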