Progressive Multi-resolution Loss for Crowd Counting

Ziheng Yan, Yuankai Qi, Guorong Li, Xinyan Liu, Weigang Zhang, Ming-Hsuan Yang, Qingming Huang
DOI: https://doi.org/10.1109/tcsvt.2023.3317518
IF: 5.859
2023-01-01
IEEE Transactions on Circuits and Systems for Video Technology
Abstract: Crowd counting is usually handled in a density map regression fashion, which is supervised via an L2 loss between the predicted density map and the ground truth. To effectively regulate models, various improved L2 loss functions have been developed to find a better correspondence between predicted density and annotation positions. In this paper, we propose to predict the density map at one resolution but measure its quality via a derived log-formed loss at multiple resolutions. Unlike existing methods that assume density maps at different resolutions are independent, our loss is obtained by modeling a likelihood function inspired by the relationship between density maps across resolutions. We find that the traditional single-resolution L2 loss is a particular case of our derived log-likelihood, and we mathematically prove that the derived loss is superior to a single-resolution L2 loss. Without bells and whistles, the proposed loss substantially improves several baselines and performs favorably compared to state-of-the-art methods on five crowd counting datasets: NWPU-Crowd, ShanghaiTech A & B, UCF-QNRF, and JHU-Crowd++. The source code and trained models are released at https://github.com/streamer-AP/PML_Loss.git.
engineering, electrical & electronic
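
To make the idea in the abstract concrete, the following is a minimal sketch of measuring one predicted density map against the ground truth at several resolutions. It is not the paper's derived log-formed loss, whose exact form is not given in the abstract. It only sums plain squared errors over sum-pooled copies of the maps. The function names `sum_pool` and `multi_resolution_l2`, the pooling factors, and the uniform weights are all assumptions made for illustration.

```python
import numpy as np


def sum_pool(density, factor):
    """Downsample a 2-D density map by summing non-overlapping blocks.

    Summing (rather than averaging) keeps the total count unchanged.
    """
    h, w = density.shape
    if h % factor or w % factor:
        raise ValueError("map size must be divisible by the pooling factor")
    blocks = density.reshape(h // factor, factor, w // factor, factor)
    return blocks.sum(axis=(1, 3))


def multi_resolution_l2(pred, target, factors=(1, 2, 4), weights=None):
    """Weighted sum of squared errors at several resolutions.

    `factors` and `weights` are illustrative choices, not values from the paper.
    With factors=(1,) this reduces to the ordinary single-resolution L2 loss.
    """
    if weights is None:
        weights = [1.0] * len(factors)
    total = 0.0
    for factor, weight in zip(factors, weights):
        diff = sum_pool(pred, factor) - sum_pool(target, factor)
        total += weight * float(np.sum(diff ** 2))
    return total


# One annotated head at (0, 0); two predictions that misplace it.
target = np.zeros((4, 4))
target[0, 0] = 1.0

near = np.zeros((4, 4))
near[0, 1] = 1.0   # off by one pixel

far = np.zeros((4, 4))
far[3, 3] = 1.0    # opposite corner

single_near = multi_resolution_l2(near, target, factors=(1,))
single_far = multi_resolution_l2(far, target, factors=(1,))
multi_near = multi_resolution_l2(near, target)
multi_far = multi_resolution_l2(far, target)
```

In this toy case the single-resolution loss gives the same value for both predictions, while the multi-resolution sum penalizes the far displacement more, because the near error vanishes once the maps are pooled into 2x2 blocks.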