VGT-MOT: Visibility-Guided Tracking for Online Multiple-Object Tracking

Shuai Wang, Wei-Xi Li, Lu Wang, Li-Sheng Xu, Qing-Xu Deng
DOI: https://doi.org/10.1007/s00138-023-01398-y
IF: 2.983
2023-01-01
Machine Vision and Applications
Abstract: Multi-object tracking (MOT) is an important computer vision task with a wide range of applications. Existing MOT methods mostly employ the Kalman filter to predict the object location in the next frame. However, if the video is captured by a camera with significant motion variation or contains objects moving at non-constant speed, the Kalman filter may fail. In addition, although object occlusion has been studied extensively in MOT, it has not yet been well addressed. To deal with these problems, this paper proposes a joint detection and tracking method named visibility-guided tracking for MOT (VGT-MOT). Specifically, to cope with the difficulty of accurate object position estimation caused by drastic camera or object motion variation, VGT-MOT uses an adjacent-frame object location prediction network with inter-frame attention to predict the target position in the next frame. To handle object occlusion, VGT-MOT employs the object visibility as a dynamic weight to adaptively fuse the motion and appearance similarities and to update the object appearance representation. VGT-MOT has been evaluated on the MOT16, MOT17 and MOT20 datasets. The results show that VGT-MOT compares favorably against state-of-the-art MOT approaches. The source code of the proposed method is available at https://github.com/wang-ironman/VGT-MOT.
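The abstract does not give the fusion formula, so the following is only a minimal sketch of what visibility-weighted fusion could look like, not the authors' implementation. It assumes a convex combination in which a detection's visibility (a value in [0, 1]) weights appearance similarity and its complement weights motion similarity, plus a visibility-scaled moving-average update of the track's appearance feature. The function names `fuse_similarity` and `update_appearance`, the `momentum` parameter, and both formulas are hypothetical.

```python
import numpy as np


def fuse_similarity(motion_sim, appearance_sim, visibility):
    """Fuse two (num_tracks, num_detections) similarity matrices.

    ASSUMPTION: convex combination weighted by per-detection visibility.
    A fully visible detection (v = 1) relies on appearance only; a fully
    occluded one (v = 0) relies on motion only.
    """
    motion_sim = np.asarray(motion_sim, dtype=float)
    appearance_sim = np.asarray(appearance_sim, dtype=float)
    # Shape (1, num_detections) so the weight broadcasts across tracks.
    v = np.clip(np.asarray(visibility, dtype=float), 0.0, 1.0)[None, :]
    return v * appearance_sim + (1.0 - v) * motion_sim


def update_appearance(track_feat, det_feat, visibility, momentum=0.9):
    """Visibility-scaled moving-average update of a track's feature.

    ASSUMPTION: the less visible the matched detection, the less it
    changes the stored feature (v = 0 leaves the feature unchanged).
    """
    v = float(np.clip(visibility, 0.0, 1.0))
    alpha = 1.0 - (1.0 - momentum) * v  # weight kept on the old feature
    new_feat = alpha * np.asarray(track_feat, dtype=float) \
        + (1.0 - alpha) * np.asarray(det_feat, dtype=float)
    norm = np.linalg.norm(new_feat)
    return new_feat / norm if norm > 0 else new_feat


if __name__ == "__main__":
    motion = [[0.2, 0.8]]       # one track, two detections
    appearance = [[0.6, 0.4]]
    vis = [1.0, 0.0]            # first detection visible, second occluded
    print(fuse_similarity(motion, appearance, vis))
```

In this sketch the fused matrix would then feed a standard assignment step; the point is only that occluded detections lean on motion cues and contribute little to the stored appearance.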