Cascaded Matching Based on Detection Box Area for Multi-Object Tracking

Songbo Gu, Miaohui Zhang, Qiyang Xiao, Wentao Shi
DOI: https://doi.org/10.1016/j.knosys.2024.112075
IF: 8.139
2024-01-01
Knowledge-Based Systems
Abstract: In the existing tracking-by-detection paradigm, advanced approaches rely on appearance features to associate current detections with trajectories. However, these methods often suffer from slow tracking and suboptimal results, particularly when the appearance features are unreliable. To address these challenges, we propose a novel cascaded matching algorithm, the detection box area-based tracking algorithm (DBAT), which groups detection boxes by area and associates the detections within each group in a cascaded manner. To improve grouping accuracy, we introduce two components that raise detection quality: the compressed self-decoding module (CSDM) and the task collaboration module (TCM). To acquire more precise location information and enrich the features, CSDM decomposes the input features into two one-dimensional feature encodings and one two-dimensional feature encoding. These encodings are then aggregated along both spatial directions to capture long-range dependencies and refine the location information. Finally, the aggregated features interact with the original features, fusing information and strengthening the overall feature representation. To alleviate potential conflicts between tasks and strengthen task-specific representations, TCM combines different receptive fields and decouples features through self-relationship and cross-relationship mappings, thereby improving learning across the different tasks concurrently. Extensive experiments demonstrate that the proposed method achieves performance comparable to state-of-the-art methods on the MOT17, MOT20 and DanceTrack benchmarks.
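The idea of grouping detection boxes by area and associating each group in turn can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the number of groups, the area thresholds, the order in which groups are processed, or the association cost. The sketch assumes fixed area thresholds (the hypothetical `edges` parameter), processes the largest boxes first, and uses greedy IoU matching with a hypothetical `iou_thresh`; all function names are illustrative.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def area(b: Box) -> float:
    return max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])


def iou(a: Box, b: Box) -> float:
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0


def group_by_area(boxes: List[Box], edges: List[float]) -> List[List[int]]:
    """Bucket box indices by area; groups[0] holds the largest boxes.

    `edges` are assumed area thresholds, not values from the paper.
    """
    groups: List[List[int]] = [[] for _ in range(len(edges) + 1)]
    for i, b in enumerate(boxes):
        # The group index is the number of thresholds the area falls below.
        g = sum(area(b) < e for e in edges)
        groups[g].append(i)
    return groups


def cascaded_match(
    tracks: Dict[int, Box],
    detections: List[Box],
    edges: List[float],
    iou_thresh: float = 0.3,  # assumed value
) -> Dict[int, int]:
    """Return {track_id: detection_index}, matching one area group per stage."""
    matches: Dict[int, int] = {}
    free_tracks = set(tracks)
    for group in group_by_area(detections, edges):  # largest boxes first (assumption)
        # All candidate pairs for this stage, best IoU first.
        pairs = sorted(
            ((iou(tracks[t], detections[d]), t, d) for t in free_tracks for d in group),
            reverse=True,
        )
        used = set()
        for score, t, d in pairs:
            if score < iou_thresh:
                break
            if t in free_tracks and d not in used:
                matches[t] = d
                free_tracks.discard(t)
                used.add(d)
    return matches


tracks = {1: (0, 0, 100, 100), 2: (200, 200, 220, 220)}
detections = [(202, 201, 222, 221), (5, 5, 105, 105)]
print(cascaded_match(tracks, detections, edges=[2500]))
```

In this toy input the large detection is associated in the first stage and the small one in the second, so tracks left unmatched by one group remain available to the next. A real tracker would typically replace the greedy step with an optimal assignment and predict track positions with a motion model before matching.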