Abstract: In crowd density estimation datasets, annotating crowd locations is extremely laborious, yet the location annotations are not used by the evaluation metrics. In this paper, we aim to reduce the annotation cost of crowd datasets and propose a weakly-supervised crowd density estimation method that requires no crowd position supervision: it directly regresses the crowd count, using only the number of pedestrians in the image as supervision. To this end, we design a new training method that exploits the correlation between global and local image features, training the network by incremental learning. Specifically, we design a parent-child network (PC-Net) whose parent and child networks focus on the global and local image, respectively, and propose a linear feature calibration structure to train the two networks simultaneously. The child network learns feature transfer factors and feature bias weights, which are used to linearly calibrate the features extracted by the parent network; this improves the convergence of the network by exploiting local features hidden in the crowd images. In addition, we use the pyramid vision transformer as the backbone of the PC-Net to extract crowd features at different levels, and design a global-local feature loss function. We combine it with a crowd counting loss to enhance the sensitivity of the network to crowd features during training, which effectively improves the accuracy of crowd density estimation. Experimental results show that the PC-Net significantly reduces the gap between fully-supervised and weakly-supervised crowd density estimation, and outperforms the comparison methods on five datasets: ShanghaiTech Part A, ShanghaiTech Part B, UCF_CC_50, UCF_QNRF and JHU-CROWD++.
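The linear feature calibration and the combined loss described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the form of either loss, so the absolute-error counting loss, the mean-squared feature loss, the `weight` parameter and all function names below are assumptions made purely for illustration. Features are plain Python lists with one value per channel.

```python
from typing import List, Sequence


def linear_calibrate(parent_feat: Sequence[float],
                     transfer: Sequence[float],
                     bias: Sequence[float]) -> List[float]:
    """Channel-wise linear calibration: f' = transfer * f + bias.

    `transfer` and `bias` stand in for the transfer factors and bias
    weights that the child network is said to learn (hypothetical names).
    """
    if not (len(parent_feat) == len(transfer) == len(bias)):
        raise ValueError("all inputs must have the same number of channels")
    return [t * f + b for f, t, b in zip(parent_feat, transfer, bias)]


def count_loss(pred_count: float, gt_count: float) -> float:
    """Assumed counting loss: absolute error on the image-level count."""
    return abs(pred_count - gt_count)


def feature_loss(global_feat: Sequence[float],
                 local_feat: Sequence[float]) -> float:
    """Assumed global-local feature loss: mean squared difference."""
    if len(global_feat) != len(local_feat):
        raise ValueError("feature vectors must have the same length")
    if not global_feat:
        raise ValueError("feature vectors must not be empty")
    squared = [(g - l) ** 2 for g, l in zip(global_feat, local_feat)]
    return sum(squared) / len(squared)


def total_loss(pred_count: float, gt_count: float,
               global_feat: Sequence[float], local_feat: Sequence[float],
               weight: float = 1.0) -> float:
    """Counting loss plus a weighted feature loss (weight is an assumption)."""
    return (count_loss(pred_count, gt_count)
            + weight * feature_loss(global_feat, local_feat))


if __name__ == "__main__":
    calibrated = linear_calibrate([1.0, 2.0], [2.0, 0.5], [0.0, 1.0])
    print(calibrated)
    print(total_loss(10.0, 12.0, [1.0, 3.0], [1.0, 1.0], weight=0.5))
```

In this sketch only the image-level count enters the supervision signal, which mirrors the weakly-supervised setting: no point or density-map annotation appears anywhere in the loss.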

Self-Supervised Learning With Data-Efficient Supervised Fine-Tuning for Crowd Counting

Few-Shot Crowd Counting Via Self-supervised Learning

Semi-Supervised Crowd Counting from Unlabeled Data

A Semi-supervised crowd counting method based on patch crowds statistics

Return of Small-Scale Crowd Counting via Fast and Accurate Semi-Supervised Least Squares Model

Learning from Crowds under Experts' Supervision

Multi-branch Progressive Embedding Network for Crowd Counting

A Self-Training Approach for Point-Supervised Object Detection and Counting in Crowds

Hybrid Perturbation Strategy for Semi-Supervised Crowd Counting

Leveraging Self-Supervision for Cross-Domain Crowd Counting

Crowd Counting With Limited Labeling Through Submodular Frame Selection

Semi-Supervised Crowd Counting with Contextual Modeling: Facilitating Holistic Understanding of Crowd Scenes

Towards using count-level weak supervision for crowd counting

Cross-head Supervision for Crowd Counting with Noisy Annotations

A Weakly Supervised Hybrid Lightweight Network for Efficient Crowd Counting

Reducing Spatial Labeling Redundancy for Semi-supervised Crowd Counting

Semi-supervised Counting via Pixel-by-pixel Density Distribution Modelling

A Weakly-Supervised Crowd Density Estimation Method Based on Two-Stage Linear Feature Calibration

Deep Rank-Consistent Pyramid Model for Enhanced Crowd Counting

S$^2$FPR: Crowd Counting via Self-Supervised Coarse to Fine Feature Pyramid Ranking

Adaptive Context Learning Network for Crowd Counting