Single Domain Generalization for Crowd Counting

Zhuoxuan Peng, S.-H. Gary Chan
2024-04-05
Abstract: Due to its promising results, density map regression has been widely employed for image-based crowd counting. The approach, however, often suffers from severe performance degradation when tested on data from unseen scenarios, the so-called "domain shift" problem. To address the problem, we investigate in this work single domain generalization (SDG) for crowd counting. The existing SDG approaches are mainly for image classification and segmentation, and can hardly be extended to our case due to its regression nature and label ambiguity (i.e., ambiguous pixel-level ground truths). We propose MPCount, a novel effective SDG approach even for narrow source distribution. MPCount stores diverse density values for density map regression and reconstructs domain-invariant features by means of only one memory bank, a content error mask and attention consistency loss. By partitioning the image into grids, it employs patch-wise classification as an auxiliary task to mitigate label ambiguity. Through extensive experiments on different datasets, MPCount is shown to significantly improve counting accuracy compared to the state of the art under diverse scenarios unobserved in the training data characterized by narrow source distribution. Code is available at
Computer Vision and Pattern Recognition
### What problem does this paper attempt to address?

This paper addresses single domain generalization (SDG) for image-based crowd counting. Specifically:

1. **Domain Shift Problem**:
   - Mainstream density map regression methods suffer severe performance degradation on data from unseen scenarios, the so-called "domain shift" problem.
2. **Single Source Domain Limitation**:
   - Existing SDG methods mainly target image classification and segmentation. They are hard to extend to crowd counting because of its regression nature and label ambiguity (i.e., ambiguous pixel-level ground truths).
3. **Narrow Source Distribution**:
   - The paper proposes MPCount, a method designed to generalize from a single source domain even when that domain's distribution is narrow.

### Method Overview

- **Attention Memory Bank (AMB)**: Handles continuous density values by storing diverse density values in a single memory bank, from which domain-invariant features are reconstructed.
- **Content Error Mask (CEM)**: Excludes feature information that may be domain-specific, keeping the reconstructed features consistent.
- **Attention Consistency Loss (ACL)**: Encourages consistent attention distributions over the memory bank for different input features.
- **Patch-wise Classification (PC)**: Divides the image into fixed-size patches and classifies each patch as an auxiliary task, mitigating pixel-level label ambiguity.

### Experimental Results

- Experiments on multiple benchmark datasets show that MPCount significantly improves counting accuracy over existing methods across various settings, particularly when the source domain has a narrow distribution.
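
To make the components listed under Method Overview more concrete, here is a minimal Python sketch. It is not the authors' implementation. It assumes scaled dot-product attention for the memory read, a squared-difference form for the attention consistency term, and binary (occupied or empty) patch labels. The function names `memory_read`, `attention_consistency`, and `patch_labels` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def memory_read(feature, memory):
    """Reconstruct a feature vector as an attention-weighted sum of memory slots.

    Assumption: attention scores are scaled dot products between the
    input feature and each memory slot.
    """
    dim = len(feature)
    scores = [
        sum(f * m for f, m in zip(feature, slot)) / math.sqrt(dim)
        for slot in memory
    ]
    weights = softmax(scores)
    reconstructed = [
        sum(w * slot[i] for w, slot in zip(weights, memory))
        for i in range(dim)
    ]
    return reconstructed, weights


def attention_consistency(weights_a, weights_b):
    """Penalty that is zero when two attention distributions match.

    Assumption: a sum of squared differences; the paper's exact loss may differ.
    """
    return sum((a - b) ** 2 for a, b in zip(weights_a, weights_b))


def patch_labels(points, height, width, patch):
    """Label each patch 1 if it contains at least one head annotation, else 0.

    Assumption: labels are binary. `points` holds (x, y) head coordinates.
    """
    rows = math.ceil(height / patch)
    cols = math.ceil(width / patch)
    labels = [[0] * cols for _ in range(rows)]
    for x, y in points:
        if 0 <= x < width and 0 <= y < height:
            labels[int(y) // patch][int(x) // patch] = 1
    return labels


# Toy usage: two memory slots, and two views of the same image region.
memory = [[1.0, 0.0], [0.0, 1.0]]
_, w_original = memory_read([1.0, 0.0], memory)
_, w_augmented = memory_read([0.9, 0.1], memory)
loss = attention_consistency(w_original, w_augmented)

# Two annotated heads in a 32x32 image split into 16x16 patches.
print(patch_labels([(5, 5), (30, 18)], 32, 32, 16))  # [[1, 0], [0, 1]]
```

Because the reconstruction is built only from memory slots, the input's own domain-specific appearance never passes through directly. That is the intuition behind using a shared memory bank for generalization.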