Abstract: Tracking in gigapixel scenarios holds numerous potential applications in video surveillance and pedestrian analysis. Existing algorithms attempt to perform tracking in crowded scenes by utilizing multiple cameras or group relationships. However, their performance significantly degrades when confronted with complex interaction and occlusion inherent in gigapixel images. In this paper, we introduce DynamicTrack, a dynamic tracking framework designed to address gigapixel tracking challenges in crowded scenes. In particular, we propose a dynamic detector that utilizes contrastive learning to jointly detect the head and body of pedestrians. Building upon this, we design a dynamic association algorithm that effectively utilizes head and body information for matching purposes. Extensive experiments show that our tracker achieves state-of-the-art performance on widely used tracking benchmarks specifically designed for gigapixel crowded scenes.

What problem does this paper attempt to address?

This paper addresses multi-object tracking in crowded scenes captured in high-resolution (gigapixel) images. The performance of existing multi-object tracking algorithms drops significantly on such images because of complex interactions and severe occlusions. To address this, the authors propose the DynamicTrack framework, which aims to improve tracking accuracy in crowded scenes by combining head and body information.

### Main Problems

1. **Complex Interactions and Severe Occlusions**: Crowded scenes in high-resolution images usually contain complex interactions and severe occlusions, which make it difficult for traditional tracking algorithms to work effectively.
2. **Limitations of Existing Methods**:
   - Multi-camera tracking methods disperse spatial information across views and rigidly partition what is a continuous space.
   - Methods that use group relationships can improve robustness, but capturing those relationships reliably is itself a challenge.

### Solutions

To overcome these challenges, the paper proposes the following:

1. **Dynamic Detection**:
   - The authors design a dynamic detector based on contrastive learning that detects the heads and bodies of pedestrians simultaneously. The detector learns embeddings and optimizes them through the Associative Embedding Loss (AML).
   - The loss is defined as follows, where the subscripts bb, hh, and bh presumably denote body-body, head-head, and body-head pairs:

   \[ L_{pull} = \mu (L_{bb}^{pull} + L_{hh}^{pull}) + \beta L_{bh}^{pull} \]

   \[ L_{push} = \mu (L_{bb}^{push} + L_{hh}^{push}) + \beta L_{bh}^{push} \]

   \[ \text{Loss}_{AML} = \sigma L_{pull} + \tau L_{push} \]

2. **Dynamic Association**:
   - The dynamic association algorithm uses both head and body features for matching. It treats the body as the primary cue and the head as an auxiliary cue, combining fine-grained local head features with global body information to make tracking more robust.
   - The algorithm uses cascade matching to process matched head-body pairs, unmatched bodies, and unmatched heads separately, so that the available information can still be used under occlusion.

### Experimental Results

The experimental results show that DynamicTrack achieves state-of-the-art performance on multiple public datasets (such as MOT20 and PANDA). Its advantage is most pronounced on complex crowded scenes in high-resolution images.

In conclusion, the paper tackles object tracking in crowded scenes in high-resolution images through the DynamicTrack framework, which is reported to perform particularly well under complex interactions and severe occlusions.
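The weighted combination in the AML formulas from the Solutions section can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the summary does not say how the individual pull and push terms are computed, so they are passed in as plain numbers. The names `EmbeddingTerms`, `combine`, and `aml_loss` are hypothetical, the reading of bb, hh, and bh as body-body, head-head, and body-head pairs is an assumption, and the weight values in the example are arbitrary.

```python
from dataclasses import dataclass


@dataclass
class EmbeddingTerms:
    """Hypothetical container for the three per-pair-type loss terms.

    Assumed reading of the subscripts: bb = body-body, hh = head-head,
    bh = body-head. How each term is computed is not given in the summary.
    """
    bb: float
    hh: float
    bh: float


def combine(terms: EmbeddingTerms, mu: float, beta: float) -> float:
    """mu * (L_bb + L_hh) + beta * L_bh, the shape shared by L_pull and L_push."""
    return mu * (terms.bb + terms.hh) + beta * terms.bh


def aml_loss(pull: EmbeddingTerms, push: EmbeddingTerms,
             mu: float, beta: float, sigma: float, tau: float) -> float:
    """Loss_AML = sigma * L_pull + tau * L_push."""
    l_pull = combine(pull, mu, beta)
    l_push = combine(push, mu, beta)
    return sigma * l_pull + tau * l_push


# Arbitrary example values, chosen only to make the arithmetic easy to follow.
total = aml_loss(
    pull=EmbeddingTerms(bb=1.0, hh=2.0, bh=3.0),
    push=EmbeddingTerms(bb=0.5, hh=0.5, bh=1.0),
    mu=1.0, beta=2.0, sigma=1.0, tau=0.5,
)
print(total)  # 1.0 * 9.0 + 0.5 * 3.0 = 10.5
```

Here `mu` weights the same-part terms, `beta` weights the cross-part (body-head) term, and `sigma` and `tau` balance pulling matching embeddings together against pushing non-matching ones apart.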
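The three-stage cascade described under Dynamic Association (head-body pairs first, then unmatched bodies, then unmatched heads) might look roughly like the sketch below. It is not the authors' implementation: the IoU-based score, the greedy matcher, the `w_body` weight, the `thresh` value, and all function names are assumptions made for illustration, and the paper may well use learned embeddings and a different assignment solver.

```python
from typing import Callable, Dict, List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Optional[Box], b: Optional[Box]) -> float:
    """Intersection over union; 0.0 if either box is missing or they do not overlap."""
    if a is None or b is None:
        return 0.0
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def greedy_match(tracks: Dict[int, dict], dets: Dict[int, dict],
                 score: Callable[[dict, dict], float], thresh: float) -> Dict[int, int]:
    """Pair track ids with detection indices, best score first (assumed matcher)."""
    cands = sorted(((score(t, d), tid, di)
                    for tid, t in tracks.items() for di, d in dets.items()),
                   key=lambda c: c[0], reverse=True)
    matches: Dict[int, int] = {}
    used = set()
    for s, tid, di in cands:
        if s < thresh:
            break  # every remaining candidate scores lower
        if tid in matches or di in used:
            continue
        matches[tid] = di
        used.add(di)
    return matches


def cascade_associate(tracks: Dict[int, dict], dets: List[dict],
                      w_body: float = 0.7, thresh: float = 0.3) -> Dict[int, int]:
    """Three stages: head-body pairs, body-only detections, head-only detections."""
    stages = [
        # Stage 1: body is the main cue, head is the auxiliary cue.
        (lambda d: d["body"] is not None and d["head"] is not None,
         lambda t, d: w_body * iou(t["body"], d["body"])
         + (1 - w_body) * iou(t["head"], d["head"])),
        # Stage 2: bodies whose head was not detected or not paired.
        (lambda d: d["body"] is not None and d["head"] is None,
         lambda t, d: iou(t["body"], d["body"])),
        # Stage 3: heads whose body is missing, for example under occlusion.
        (lambda d: d["body"] is None and d["head"] is not None,
         lambda t, d: iou(t["head"], d["head"])),
    ]
    remaining = dict(tracks)
    assigned: Dict[int, int] = {}
    for keep, score in stages:
        subset = {i: d for i, d in enumerate(dets) if keep(d)}
        found = greedy_match(remaining, subset, score, thresh)
        assigned.update(found)
        for tid in found:
            del remaining[tid]  # matched tracks skip the later stages
    return assigned


tracks = {
    1: {"body": (0, 0, 10, 20), "head": (3, 0, 7, 4)},
    2: {"body": (50, 0, 60, 20), "head": (53, 0, 57, 4)},
    3: {"body": (100, 0, 110, 20), "head": (103, 0, 107, 4)},
}
dets = [
    {"body": (1, 0, 11, 20), "head": (4, 0, 8, 4)},   # head-body pair
    {"body": (51, 0, 61, 20), "head": None},          # body only
    {"body": None, "head": (103, 0, 107, 4)},         # head only
]
result = cascade_associate(tracks, dets)
print(result)  # {1: 0, 2: 1, 3: 2}
```

In this toy example, track 3 is recovered from its head alone, which is the situation the cascade is meant to handle when the body is occluded.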

DynamicTrack: Advancing Gigapixel Tracking in Crowded Scenes
