SMP-Track: SAM in Multi-Pedestrian Tracking

Shiyin Wang, Huafeng Liu, Qiong Wang, Yichao Zhou, Yazhou Yao
DOI: https://doi.org/10.1109/dsaa61799.2024.10722829
2024-01-01
Abstract: Multiple Object Tracking (MOT) is a fundamental problem in video surveillance and plays a crucial role in security data analysis. Our goal is to design a tracker that is robust to data damaged by attacks while also protecting privacy. The mainstream paradigm for MOT is tracking-by-detection (TBD), which performs object detection followed by target association. In the association stage, most models rely on Intersection over Union (IoU) similarity of bounding boxes for short-range matching and cosine similarity of appearance features for long-range matching. However, both similarities take in a large amount of redundant background in addition to the target. To this end, we propose a new tracker named SMP-Track that integrates the Segment Anything Model (SAM) into multi-pedestrian tracking. First, we extract pedestrian masks using box prompts, focusing solely on foreground information. Then we introduce a new similarity metric, box-mask similarity, that combines the advantages of motion and foreground information. Extensive experiments demonstrate that SMP-Track improves the main metrics on the MOT17 validation set and achieves performance comparable to other state-of-the-art methods on the MOT17 and MOT20 test sets. Furthermore, by incorporating pedestrian masks, we reduce reliance on raw pedestrian images or features, making the model robust to corrupted data and mitigating the risk of privacy leakage.
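The abstract does not give the formula for box-mask similarity, so the following is only an illustrative sketch of how a box-level (motion) cue and a mask-level (foreground) cue might be blended. The weighted sum, the `alpha` parameter, and the function names `box_iou`, `mask_iou`, and `box_mask_similarity` are all assumptions made for this example and should not be read as the authors' definition. Masks are assumed to be boolean arrays at full image resolution, such as those a segmentation model could return for a box prompt.

```python
import numpy as np


def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def mask_iou(m1, m2):
    """IoU of two boolean masks with the same shape."""
    m1 = np.asarray(m1, dtype=bool)
    m2 = np.asarray(m2, dtype=bool)
    inter = np.logical_and(m1, m2).sum()
    union = np.logical_or(m1, m2).sum()
    return float(inter) / float(union) if union > 0 else 0.0


def box_mask_similarity(box_a, mask_a, box_b, mask_b, alpha=0.5):
    """Hypothetical blend of box IoU and mask IoU.

    alpha is an assumed weight, not a value taken from the paper:
    alpha=1 uses boxes only, alpha=0 uses masks only.
    """
    return alpha * box_iou(box_a, box_b) + (1.0 - alpha) * mask_iou(mask_a, mask_b)


# Toy example: two 2x2 boxes offset by one pixel, with fully filled masks.
mask_a = np.zeros((4, 4), dtype=bool)
mask_a[0:2, 0:2] = True
mask_b = np.zeros((4, 4), dtype=bool)
mask_b[1:3, 1:3] = True
score = box_mask_similarity((0, 0, 2, 2), mask_a, (1, 1, 3, 3), mask_b)
```

In a tracking-by-detection pipeline, a score of this kind would typically fill the cost matrix passed to an assignment step between existing tracks and new detections. When masks hug the pedestrian more tightly than the box, the mask term penalizes background overlap that box IoU alone would count as a match.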