Abstract: Crowd counting is a widely studied yet challenging task in computer vision, and scale variation in crowd images makes it particularly difficult. Most state-of-the-art approaches tackle the multi-scale problem with multi-column CNN architectures, in which each column uses a different filter size to adapt to varying pedestrian/object sizes. However, such structures are bloated and inefficient, and using multiple deep columns is infeasible because of the huge resource cost. We instead propose the Scale Pyramid Network (SPN), which adopts a single shared deep column and extracts multi-scale information in high layers through a Scale Pyramid Module. Within this module, we employ dilated convolutions with different rates in parallel instead of traditional convolutions with different kernel sizes. Compared with other methods for coping with scale issues, our single-column structure with the Scale Pyramid Module yields more accurate estimates with a simpler structure and lower training complexity. The Scale Pyramid Module can also be easily applied to a deep network. Experimental results on four datasets show that our method achieves state-of-the-art performance. On the Shanghai-Tech Part A dataset, which is challenging for its highly congested scenes and scale variation, we achieve 9.5% lower MAE and 13.5% lower MSE than the previous state-of-the-art method. We also extend our model to the TRANCOS vehicle counting dataset and achieve 5.9% lower GAME(0), 10% lower GAME(1), 24.5% lower GAME(2), and 38.7% lower GAME(3) than the previous state-of-the-art method. The experimental results demonstrate the robustness of our model for crowd counting, especially under scale variations.
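
The idea of running dilated convolutions with different rates in parallel can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names, the 3x3 kernel, and the rates (1, 2, 4) are assumptions chosen for illustration, and the abstract does not state how the parallel branches are fused, so the sketch simply returns one output map per rate. The point it shows is that a fixed-size kernel covers a span of k + (k - 1)(r - 1) pixels at dilation rate r, so one kernel size can respond to several object scales without adding weights.

```python
def effective_kernel_size(kernel_size, rate):
    """Span in pixels covered by a dilated kernel: k + (k - 1) * (rate - 1)."""
    return kernel_size + (kernel_size - 1) * (rate - 1)


def dilated_conv2d(image, kernel, rate):
    """Zero-padded, same-size 2D dilated convolution (cross-correlation form).

    `image` and `kernel` are lists of lists; `kernel` is square with odd size.
    """
    height, width = len(image), len(image[0])
    k = len(kernel)
    half = k // 2
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0.0
            for i in range(k):
                for j in range(k):
                    # Kernel taps are spaced `rate` pixels apart.
                    yy = y + (i - half) * rate
                    xx = x + (j - half) * rate
                    if 0 <= yy < height and 0 <= xx < width:
                        total += kernel[i][j] * image[yy][xx]
            out[y][x] = total
    return out


def scale_pyramid(image, kernel, rates=(1, 2, 4)):
    """Apply one kernel at several dilation rates in parallel.

    The rates are illustrative assumptions, not values from the paper.
    Returns a dict mapping each rate to its output feature map.
    """
    return {rate: dilated_conv2d(image, kernel, rate) for rate in rates}


# Usage: a single bright pixel in a 9x9 image, filtered by a 3x3 box kernel.
impulse = [[0.0] * 9 for _ in range(9)]
impulse[4][4] = 1.0
box = [[1.0] * 3 for _ in range(3)]
branches = scale_pyramid(impulse, box)
```

Each branch has the same spatial size as the input, so in a real network the branch outputs could be stacked along the channel axis; that fusion step is left out here because the abstract does not specify it.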

Crowd Distribution Estimation with Multi-scale Recursive Convolutional Neural Network

Scale Pyramid Network For Crowd Counting

Relevant Region Prediction for Crowd Counting

Multi-branch Progressive Embedding Network for Crowd Counting

Single-Image Crowd Counting Via Multi-Column Convolutional Neural Network

Leverage Multi-Scale Dilated Convolutional Neural Network with Global Attention Feature Fusion for Crowd Counting

Redesigning Multi-Scale Neural Network for Crowd Counting

Bayesian Multi Scale Neural Network for Crowd Counting

Learning Multi-Level Density Maps for Crowd Counting

Concise Convolutional Neural Network for Crowd Counting

Image Crowd Counting Using Convolutional Neural Network and Markov Random Field

Crowd Counting Method Based on Convolutional Neural Network with Global Density Feature

Multi-scale features fused network with multi-level supervised path for crowd counting

Crowd density estimation based on multi scale features fusion network with reverse attention mechanism

Density-Aware Multi-Task Learning for Crowd Counting

Crowd Counting Based on Multiresolution Density Map and Parallel Dilated Convolution

Improving Dense Crowd Counting Convolutional Neural Networks Using Inverse K-Nearest Neighbor Maps and Multiscale Upsampling

An encoder-decoder network for crowd counting based on multi-scale attention mechanism

Reaction Time as a Function of Onset and Offset Stimulation of the Fovea and Periphery

Crowd Counting by Multi-Scale Dilated Convolution Networks

Revisiting Perspective Information For Efficient Crowd Counting