Abstract: Multi-object tracking is a fundamental problem in computer vision research, with widespread applications in public safety, transportation, autonomous vehicles, robotics, and other domains involving artificial intelligence. Because natural scenes are complex, occlusion and partial occlusion of objects are commonplace even in basic tracking tasks. These factors often lead to ID switching, object loss, detection errors, and misaligned bounding boxes, which significantly reduce the precision of multi-object tracking. This paper addresses these issues and proposes a novel multi-object tracker that incorporates Relative Location Mapping (RLM) and Target Region Density (TRD) modeling. The new tracker is more sensitive to differences in the spatial relationships between targets, and it dynamically introduces low-scoring detection boxes into different regions based on the density of target regions in the image. This improves tracking accuracy while avoiding a significant increase in computational cost. The results indicate that, when the method is applied to state-of-the-art multi-object tracking approaches, the proposed model improves the HOTA and IDF1 metrics by 0.4 to 0.8 points on the MOT17 and MOT20 datasets. This demonstrates the effectiveness of the proposed method in enhancing multi-object tracking performance.
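The density-dependent admission of low-scoring detection boxes mentioned in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the fixed grid of regions, the use of high-score detection counts as the density measure, the linear interpolation between a `strict` and a `loose` threshold, and all names (`Detection`, `region_index`, `select_low_score_boxes`) and default values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """Axis-aligned box in pixel coordinates with a detector confidence score."""
    x1: float
    y1: float
    x2: float
    y2: float
    score: float

    @property
    def center(self):
        return ((self.x1 + self.x2) / 2.0, (self.y1 + self.y2) / 2.0)


def region_index(det, img_w, img_h, cols, rows):
    """Map a detection's center to a (row, col) cell of a hypothetical grid."""
    cx, cy = det.center
    col = min(max(int(cx / img_w * cols), 0), cols - 1)
    row = min(max(int(cy / img_h * rows), 0), rows - 1)
    return row, col


def select_low_score_boxes(dets, img_w, img_h, cols=4, rows=4,
                           high_thresh=0.6, loose=0.1, strict=0.5):
    """Return the low-score detections admitted under a per-region threshold.

    Assumption: density is the count of high-score detections in a cell,
    normalized by the busiest cell. Dense cells use a threshold near `loose`
    (occluded targets are likely there); sparse cells use one near `strict`.
    """
    counts = [[0] * cols for _ in range(rows)]
    for d in dets:
        if d.score >= high_thresh:
            r, c = region_index(d, img_w, img_h, cols, rows)
            counts[r][c] += 1

    peak = max(max(row) for row in counts)
    kept = []
    for d in dets:
        if d.score >= high_thresh:
            continue  # high-score boxes are handled by the usual first pass
        r, c = region_index(d, img_w, img_h, cols, rows)
        density = counts[r][c] / peak if peak else 0.0
        thresh = strict - (strict - loose) * density
        if d.score >= thresh:
            kept.append(d)
    return kept
```

Under these assumptions, a box with score 0.2 is admitted when it sits in the most crowded cell but rejected in an empty cell, so fewer low-score boxes reach the association step in sparse regions.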

What problem does this paper attempt to address?

This paper attempts to address several key issues in multi-object tracking, mainly including:

1. **Target Occlusion**: In natural scenes, occlusion and partial occlusion between targets are very common, which can lead to ID switches, target loss, detection errors, and bounding box misalignment in multi-object tracking tasks, thereby severely affecting the accuracy of multi-object tracking.
2. **Restoration of Target Positional Relationships**: In multi-object tracking scenarios, the relative positions of targets captured by the camera in the image do not reflect the actual spatial distance relationships. Different shooting angles can cause inconsistent distances between near and far targets, and even visual differences in motion speed.
3. **Rational Allocation of Computational Resources**: Multi-object tracking systems need to process large-scale data streams in real time, adapt to changing environments, and provide instant decision support. Therefore, how to allocate computational resources reasonably to improve tracking efficiency is an important issue.

To address these issues, the paper proposes a new multi-object tracking method that combines **Relative Location Mapping (RLM)** and **Target Region Density (TRD) modeling**. The specific contributions are as follows:

1. **Relative Location Mapping Model**: By projecting the positions of targets in video images onto a virtual plane, this model can better restore the actual positional relationships between targets, reducing ID switch errors caused by factors such as target size, posture, and occlusion.
2. **Target Region Density Model**: This model quantifies the target density in different regions of the video image, allowing the tracker to adaptively calculate the threshold for low-score detection boxes based on regional density. This reduces the interference of low-score detection boxes in low-density target areas while lowering computational costs.
3. **Standardized Bounding Box Method**: By generating standardized target bounding boxes in the mapped image, this method uses a unified approach to mark the actual positions of targets on the virtual plane for the first time. This method can reduce occlusion phenomena caused by inconsistent target bounding box sizes and enhance the tracker's perception of target motion speed and positional changes, thereby improving tracking accuracy.

Experimental results show that these improved methods effectively reduce the probability of identity switches and target loss in crowded scenes, providing a new effective approach to solving occlusion problems and enhancing the performance and robustness of multi-object tracking systems. These contributions are of great significance for complex multi-object tracking scenarios in practical applications (such as surveillance, autonomous driving, and robot navigation).
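The idea of projecting targets onto a virtual plane and marking them with standardized boxes can be sketched as follows. This is only an illustration under stated assumptions, not the paper's actual mapping: it assumes the image-to-plane mapping is a known 3x3 homography `H`, that a target's position is represented by the bottom-center ("foot") point of its bounding box, and that the standardized box is a fixed-size square. The function names and the `size` parameter are hypothetical.

```python
def project_point(H, x, y):
    """Apply a 3x3 homography H (nested lists) to the image point (x, y)."""
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity under this homography")
    return u / w, v / w


def foot_point(box):
    """Bottom-center of an (x1, y1, x2, y2) box: assumed ground contact point."""
    x1, y1, x2, y2 = box
    return (x1 + x2) / 2.0, y2


def standardized_box(box, H, size=1.0):
    """Map a detection onto the virtual plane and return a fixed-size square.

    Every target gets the same box size on the plane, so overlap between
    boxes depends on the mapped positions rather than on apparent image size.
    """
    px, py = project_point(H, *foot_point(box))
    half = size / 2.0
    return (px - half, py - half, px + half, py + half)


# Usage with a hypothetical homography that halves all coordinates.
H = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 2.0]]
near = standardized_box((0.0, 0.0, 2.0, 4.0), H, size=1.0)
```

A tracker built on this sketch would compute association costs (for example IoU or center distance) between the standardized boxes instead of the raw image boxes. How the paper derives its mapping and box size is not stated in this summary, so `H` and `size` should be read as placeholders.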

Rlm-tracking: online multi-pedestrian tracking supported by relative location mapping

Object-Level Pseudo-3D Lifting for Distance-Aware Tracking

Chained-Tracker: Chaining Paired Attentive Regression Results for End-to-End Joint Multiple-Object Detection and Tracking

MAT: Motion-Aware Multi-Object Tracking

Real-time Multi-Object Tracking Based on Bi-directional Matching

MotionTrack: Learning Robust Short-term and Long-term Motions for Multi-Object Tracking

Multi-features Guided Robust Visual Tracking.

Real-Time Online Multi-Object Tracking

Multi-object Detection, Tracking and Prediction in Rugged Dynamic Environments

MapTrack: Tracking in the Map

Multi-object Tracking by Expanding Long-Tracklets

Multiple object tracking with appearance feature prediction and similarity fusion

PANet: An end-to-end Network based on Relative Motion for Online Multi-object Tracking

MCTrack: A Unified 3D Multi-Object Tracking Framework for Autonomous Driving

Improved multi object tracking with locality sensitive hashing

Distractor-aware discrimination learning for online multiple object tracking

Motion-to-Matching: A Mixed Paradigm for 3D Single Object Tracking

Multi-target tracking based on appearance features and similarity fusion

Online multi-object tracking with pedestrian re-identification and occlusion processing

MM-Tracker: Visual Tracking with A Multi-Task Model Integrating Detection and Differentiating Feature Extraction

3D Multi-Object Tracking in Point Clouds Based on Prediction Confidence-Guided Data Association