Abstract: Crowd counting is an important research topic in computer vision and image processing, and the monitoring and management of crowded scenes is an increasingly prominent problem. Existing methods still suffer from severe overlap in density maps within dense areas, which limits both counting and localization accuracy. This paper addresses crowd counting and localization together. First, to address the limited localization performance of density maps in existing algorithms, we optimize the generation of FIDT maps and decouple the counting and localization tasks. By avoiding overlap in dense areas, the optimized label maps achieve a good balance between counting and localization accuracy, with MAE and MSE of 64.1 and 103.9 on SHHA and 10.9 and 17.4 on SHHB, respectively. Second, to address the scale insensitivity of the encoder and the potential loss of critical features during encoding, we propose the Adaptive Feature Fusion Module and the Multi-Scale Global Attention Upsampling Module, and use them to construct the CALNET network. By reducing redundant features inside and outside the separable branch, the model achieves global fusion of shallow features during decoding. The F1-m scores on the SHHA and SHHB datasets reach 72.9% and 79.4%, respectively, significantly improving the model's performance. Finally, we extend crowd counting and localization algorithms to other domains, including citrus orchards, vehicles, and campus crowds. These experiments validate the robustness and transferability of the network, broaden the range of applications for crowd counting and localization algorithms, and leave more room for future research.
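The MAE and MSE figures quoted in the abstract are count-level errors averaged over test images. As a minimal sketch, assuming the usual crowd-counting convention in which "MSE" denotes the root of the mean squared count error (the abstract does not state its definition), the two metrics can be computed as follows. The function names and the sample counts are hypothetical and chosen only for illustration.

```python
import math


def count_mae(pred_counts, true_counts):
    """Mean absolute error between predicted and ground-truth counts per image."""
    errors = [abs(p - t) for p, t in zip(pred_counts, true_counts)]
    return sum(errors) / len(errors)


def count_rmse(pred_counts, true_counts):
    """Root of the mean squared count error (assumed meaning of "MSE" here)."""
    squared = [(p - t) ** 2 for p, t in zip(pred_counts, true_counts)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical per-image counts, not taken from the paper.
predicted = [10, 20, 33]
ground_truth = [12, 20, 30]

print(f"MAE = {count_mae(predicted, ground_truth):.3f}")
print(f"MSE (root form) = {count_rmse(predicted, ground_truth):.3f}")
```

Because the squared term penalizes large per-image errors more heavily, a gap between the two values (as in 64.1 versus 103.9 on SHHA) suggests that a few images contribute disproportionately large count errors.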

SRNet: Scale-Aware Representation Learning Network for Dense Crowd Counting

Scale Pyramid Network For Crowd Counting

Semantic-refined Spatial Pyramid Network for Crowd Counting

Multi-branch Progressive Embedding Network for Crowd Counting

Crowd Counting via Hierarchical Scale Recalibration Network

Reaction Time as a Function of Onset and Offset Stimulation of the Fovea and Periphery

SCLNet: Spatial context learning network for congested crowd counting

Leverage Multi-Scale Dilated Convolutional Neural Network with Global Attention Feature Fusion for Crowd Counting

An encoder-decoder network for crowd counting based on multi-scale attention mechanism

In Defense of Single-column Networks for Crowd Counting

Scale-Aware Network with Regional and Semantic Attentions for Crowd Counting under Cluttered Background

Redesigning Multi-Scale Neural Network for Crowd Counting

Attention Scaling For Crowd Counting

A Crowd Counting and Localization Network Based on Adaptive Feature Fusion and Multi-Scale Global Attention Up Sampling

SGCNet: Scale-aware and global contextual network for crowd counting

Crowd Counting by Multi-Scale Dilated Convolution Networks

DDRANet: A Dynamic Density-Region-Aware Network for Crowd Counting

Improving Crowd Counting with Scale‐aware Convolutional Neural Network

Bayesian Multi Scale Neural Network for Crowd Counting

A Scale Aggregation and Spatial-Aware Network for Multi-View Crowd Counting

Cascade-guided multi-scale attention network for crowd counting