Crowd Counting Via Learning Perspective for Multi-Scale Multi-View Web Images

Chong Shang, Haizhou Ai, Yi Yang
DOI: https://doi.org/10.1007/s11704-017-6598-3
IF: 2.6688
2018-01-01
Frontiers of Computer Science
Abstract: Estimating the number of people in web images remains a challenging problem owing to perspective variation, different views, and diverse backgrounds. Existing deep learning models still have difficulty with scenes in which a person appears either extremely large or extremely small. In this paper, we propose a novel perspective-aware architecture to estimate the number of people in a crowd in web images. Specifically, we use a two-stage framework: we first learn a policy network to infer the perspective of the target scene, which outputs a scale label for the subsequent perspective normalization. Next, given the aligned inputs, we further adjust the scale-specific counting network to regress the final count. Experiments on challenging datasets demonstrate that our approach can handle large perspective variation and achieves state-of-the-art results.
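The two-stage flow described in the abstract (infer a scale label, normalize the input, then apply a scale-specific counter) might be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the names `resize_nearest` and `estimate_count`, the `policy` and `counters` callables, the `scale_factors` mapping, and the use of simple rescaling as the normalization step are all assumptions made for the example. In the paper these roles are played by learned networks.

```python
from typing import Callable, Dict, List

Image = List[List[float]]  # grayscale image stored as rows of pixel values


def resize_nearest(image: Image, factor: float) -> Image:
    """Rescale an image by `factor` using nearest-neighbour sampling."""
    src_h, src_w = len(image), len(image[0])
    dst_h = max(1, round(src_h * factor))
    dst_w = max(1, round(src_w * factor))
    return [
        [
            image[min(src_h - 1, int(r / factor))][min(src_w - 1, int(c / factor))]
            for c in range(dst_w)
        ]
        for r in range(dst_h)
    ]


def estimate_count(
    image: Image,
    policy: Callable[[Image], str],              # hypothetical stand-in for the policy network
    scale_factors: Dict[str, float],             # assumed mapping: scale label -> resize factor
    counters: Dict[str, Callable[[Image], float]],  # hypothetical scale-specific counters
) -> float:
    # Stage 1: the policy predicts a discrete scale label for the scene.
    label = policy(image)
    # Normalization (assumed here to be a plain rescale): bring people
    # to a roughly canonical size before counting.
    aligned = resize_nearest(image, scale_factors[label])
    # Stage 2: the counter associated with that label regresses the count.
    return counters[label](aligned)
```

Passing the policy and counters in as plain callables keeps the sketch independent of any deep learning framework; trained models could be substituted without changing the control flow.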