RLM-Tracking: online multi-pedestrian tracking supported by relative location mapping

Kai Ren, Chuanping Hu, Hao Xi
DOI: https://doi.org/10.1007/s13042-023-02070-7
2024-01-18
International Journal of Machine Learning and Cybernetics
Abstract: Multi-object tracking is a fundamental problem in computer vision research, with widespread applications in public safety, transportation, autonomous vehicles, robotics, and other areas of artificial intelligence. Because natural scenes are complex, occlusion and partial occlusion of objects are common even in basic tracking tasks. They often lead to ID switches, lost objects, detection errors, and misaligned bounding boxes, which significantly reduce the precision of multi-object tracking. This paper addresses these issues by proposing a novel multi-object tracker that incorporates Relative Location Mapping (RLM) and Target Region Density (TRD) modeling. The new tracker is more sensitive to differences in the spatial relationships between targets, which allows it to dynamically introduce low-scoring detection boxes into different regions based on the density of target regions in the image. This improves tracking accuracy while avoiding the consumption of a significant amount of computational resources. Our results indicate that when this method is applied to state-of-the-art multi-object tracking approaches, the proposed model improves the HOTA and IDF1 metrics by 0.4 to 0.8 points on the MOT17 and MOT20 datasets, demonstrating its effectiveness in enhancing multi-object tracking performance.
computer science, artificial intelligence
What problem does this paper attempt to address?
This paper attempts to address several key issues in multi-object tracking, mainly including:

1. **Target Occlusion**: In natural scenes, occlusion and partial occlusion between targets are very common, which can lead to ID switches, target loss, detection errors, and bounding box misalignment in multi-object tracking tasks, thereby severely affecting the accuracy of multi-object tracking.
2. **Restoration of Target Positional Relationships**: In multi-object tracking scenarios, the relative positions of targets captured by the camera in the image do not reflect the actual spatial distance relationships. Different shooting angles can cause inconsistent distances between near and far targets, and even visual differences in motion speed.
3. **Rational Allocation of Computational Resources**: Multi-object tracking systems need to process large-scale data streams in real-time, adapt to changing environments, and provide instant decision support. Therefore, how to allocate computational resources reasonably to improve tracking efficiency is an important issue.

To address these issues, the paper proposes a new multi-object tracking method that combines **Relative Location Mapping (RLM)** and **Target Region Density (TRD) modeling**. The specific contributions are as follows:

1. **Relative Location Mapping Model**: By projecting the positions of targets in video images onto a virtual plane, this model can better restore the actual positional relationships between targets, reducing ID switch errors caused by factors such as target size, posture, and occlusion.
2. **Target Region Density Model**: This model quantifies the target density in different regions of the video image, allowing the tracker to adaptively calculate the threshold for low-score detection boxes based on regional density. This reduces the interference of low-score detection boxes in low-density target areas while lowering computational costs.
3. **Standardized Bounding Box Method**: By generating standardized target bounding boxes in the mapped image, this method uses a unified approach to mark the actual positions of targets on the virtual plane for the first time. This method can reduce occlusion phenomena caused by inconsistent target bounding box sizes, enhance the tracker's perception of target motion speed and positional changes, thereby improving tracking accuracy.

Experimental results show that these improved methods effectively reduce the probability of identity switches and target loss in crowded scenes, providing a new effective approach to solving occlusion problems and enhancing the performance and robustness of multi-object tracking systems. These contributions are of great significance for complex multi-object tracking scenarios in practical applications (such as surveillance, autonomous driving, and robot navigation).
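To make the two core ideas more concrete, the following is a minimal Python sketch, not the authors' implementation. It rests on several assumptions that do not come from the paper: the mapping onto the virtual plane is stood in for by a generic 3x3 homography `H`, regions are equal-width vertical strips of the image, density is the count of confident boxes per strip, and the threshold falls linearly with that count. All function names and default parameters (`base`, `floor`, `step`, `high`) are hypothetical.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels
Detection = Tuple[Box, float]            # (box, confidence score)


def foot_point(box: Box) -> Tuple[float, float]:
    """Bottom-centre of a box, taken here as the target's ground contact point."""
    x1, _y1, x2, y2 = box
    return ((x1 + x2) / 2.0, y2)


def map_to_virtual_plane(point: Tuple[float, float],
                         H: Sequence[Sequence[float]]) -> Tuple[float, float]:
    """Apply a 3x3 homography H to an image point (assumed stand-in for RLM)."""
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (u / w, v / w)


def region_index(box: Box, image_width: float, n_regions: int) -> int:
    """Index of the vertical strip that contains the box centre."""
    centre_x = (box[0] + box[2]) / 2.0
    strip_width = image_width / n_regions
    idx = int(centre_x // strip_width)
    return max(0, min(idx, n_regions - 1))  # clamp boxes that leave the frame


def region_counts(boxes: Sequence[Box], image_width: float,
                  n_regions: int) -> List[int]:
    """Number of boxes per strip, used as a crude density estimate."""
    counts = [0] * n_regions
    for box in boxes:
        counts[region_index(box, image_width, n_regions)] += 1
    return counts


def low_score_threshold(count: int, base: float = 0.5,
                        floor: float = 0.1, step: float = 0.1) -> float:
    """Denser regions get a lower threshold, so more low-score boxes pass."""
    return max(floor, base - step * count)


def select_detections(dets: Sequence[Detection], image_width: float,
                      n_regions: int = 4, high: float = 0.6) -> List[Detection]:
    """Keep detections whose score passes the threshold of their own region."""
    confident = [box for box, score in dets if score >= high]
    counts = region_counts(confident, image_width, n_regions)
    kept = []
    for box, score in dets:
        idx = region_index(box, image_width, n_regions)
        if score >= low_score_threshold(counts[idx]):
            kept.append((box, score))
    return kept
```

In this sketch a low-score box is admitted only where confident detections are already crowded, which mirrors the summary's point that low-score boxes are suppressed in low-density areas. The bottom-centre foot point is used for the projection because it is less sensitive to box height than the box centre. That choice is also an assumption.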