Hybrid-SORT: Weak Cues Matter for Online Multi-Object Tracking

Mingzhan Yang, Guangxin Han, Bin Yan, Wenhua Zhang, Jinqing Qi, Huchuan Lu, Dong Wang
2024-01-20
Abstract: Multi-Object Tracking (MOT) aims to detect and associate all desired objects across frames. Most methods accomplish the task by explicitly or implicitly leveraging strong cues (i.e., spatial and appearance information), which exhibit powerful instance-level discrimination. However, when object occlusion and clustering occur, spatial and appearance information will become ambiguous simultaneously due to the high overlap among objects. In this paper, we demonstrate this long-standing challenge in MOT can be efficiently and effectively resolved by incorporating weak cues to compensate for strong cues. Along with velocity direction, we introduce the confidence and height state as potential weak cues. With superior performance, our method still maintains Simple, Online and Real-Time (SORT) characteristics. Also, our method shows strong generalization for diverse trackers and scenarios in a plug-and-play and training-free manner. Significant and consistent improvements are observed when applying our method to 5 different representative trackers. Further, with both strong and weak cues, our method Hybrid-SORT achieves superior performance on diverse benchmarks, including MOT17, MOT20, and especially DanceTrack where interaction and severe occlusion frequently happen with complex motions. The code and models are available at https://github.com/ymzis69/HybridSORT.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
### Problems the Paper Aims to Solve

This paper addresses the common problems of occlusion and clustering in multi-object tracking (MOT). Specifically:

1. **Introduction of Weak Cues**:
   - Weak cues (such as confidence state, height state, and velocity direction) can effectively mitigate the unreliability of strong cues (spatial and appearance information) under occlusion and clustering.
   - The paper introduces these weak cues to compensate for the shortcomings of traditional strong cues, thereby improving tracking performance.
2. **Simple and Effective Strategies**:
   - Two methods, Tracklet Confidence Modeling (TCM) and Height Modulated IoU (HMIoU), are proposed to model and utilize weak cues.
   - These methods significantly enhance tracking performance while maintaining the Simple, Online and Real-Time (SORT) characteristics.
3. **Generality and Plug-in Design**:
   - The design is general and plug-and-play, achieving consistent and significant improvements across different trackers.
   - Validation was conducted on five representative trackers (SORT, DeepSORT, MOTDT, ByteTrack, and OC-SORT), with notable improvements on each.
4. **Improvement of Existing Algorithms**:
   - The state-of-the-art SORT-like algorithm OC-SORT was improved, further enhancing its performance, especially on the DanceTrack, MOT17, and MOT20 benchmarks.

In summary, the main objective of this paper is to overcome the challenges posed by occlusion and clustering by introducing weak cues, thereby enhancing the accuracy and robustness of multi-object tracking.
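
To make the Height Modulated IoU idea more concrete, the following is a minimal sketch rather than the authors' implementation. This summary does not give the formula, so the sketch assumes HMIoU is the product of the standard IoU and a one-dimensional IoU computed along the vertical (height) axis. The `(x1, y1, x2, y2)` box format and all function names are hypothetical choices made for illustration.

```python
from typing import Tuple

# Assumed box format: (x1, y1, x2, y2) in pixels, with x1 < x2 and y1 < y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Standard intersection-over-union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def height_iou(a: Box, b: Box) -> float:
    """1-D IoU along the vertical axis only (assumed form of the height cue)."""
    overlap = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    span = max(a[3], b[3]) - min(a[1], b[1])
    return overlap / span if span > 0 else 0.0


def height_modulated_iou(a: Box, b: Box) -> float:
    """Assumed HMIoU: standard IoU down-weighted by the height-axis IoU."""
    return height_iou(a, b) * iou(a, b)


if __name__ == "__main__":
    tall = (0.0, 0.0, 10.0, 20.0)
    short = (0.0, 0.0, 10.0, 10.0)
    print(iou(tall, short), height_modulated_iou(tall, short))
```

Under this assumption, two boxes that overlap heavily but differ in height score lower than under plain IoU. That is the intuition behind using height as a weak cue: when people cluster, their horizontal extents overlap, but their box heights tend to stay distinct.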