A Dynamic-Attention On Crowd Region With Physical Optical Flow Features For Crowd Counting

Qian Wang, Wenxi Li, Songjian Chen, Rui Feng
DOI: https://doi.org/10.1109/IJCNN48605.2020.9207409
2020-01-01
Abstract: Crowd counting is widely used in video surveillance applications. However, most existing approaches process each video frame independently, which introduces redundant information and lowers efficiency because the contextual history of neighboring frames is ignored. In this paper, we propose a novel two-stream dynamic-attention network (DANet) that associates temporal and spatial information. Specifically, DANet comprises two stages: the first generates a region-attention map, and the second refines a high-quality density map. In each stage, we develop a hierarchical fusion strategy to guide spatial attention, which iteratively refines the crowd region. In addition, the dynamic-attention module, guided by physical optical flow, can be dynamically integrated into any network module to optimize feature generation, so it can be plugged into many computer vision architectures. Finally, experimental results on three challenging benchmark datasets show that DANet outperforms most previous methods. Incorporating such dynamic attention into a framework could boost the performance of end-to-end CNN-based methods.
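
The idea of letting optical flow guide attention over feature maps can be illustrated with a minimal sketch. This is not the DANet module itself, whose details the abstract does not give. It assumes a precomputed dense flow field of shape (2, H, W) and a feature map of shape (C, H, W), and it uses a normalized flow magnitude as a simple residual attention mask. The function name `flow_guided_attention` and the residual weighting are hypothetical choices made for illustration.

```python
import numpy as np

def flow_guided_attention(features, flow, eps=1e-8):
    """Reweight features by normalized optical-flow magnitude.

    Illustrative assumption only, not the paper's actual module.
    features: array of shape (C, H, W)
    flow:     array of shape (2, H, W) holding horizontal and vertical motion
    """
    # Per-pixel motion strength from the two flow components.
    magnitude = np.sqrt(flow[0] ** 2 + flow[1] ** 2)
    # Scale into [0, 1]; eps avoids division by zero for static scenes.
    attention = magnitude / (magnitude.max() + eps)
    # Residual weighting: static regions keep their features unchanged,
    # moving regions are amplified by up to a factor of two.
    return features * (1.0 + attention[None, :, :])

# Toy input: uniform features with motion at a single pixel.
features = np.ones((4, 3, 3), dtype=np.float32)
flow = np.zeros((2, 3, 3), dtype=np.float32)
flow[0, 1, 1] = 2.0
refined = flow_guided_attention(features, flow)
```

Because the mask depends only on the flow field and the feature shape, a function like this could in principle be applied after any convolutional block, which is the sense in which such a module is pluggable.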