SSL-MOT: self-supervised learning based multi-object tracking

Sangwon Kim, Jimi Lee, Byoung Chul Ko
DOI: https://doi.org/10.1007/s10489-022-03473-9
IF: 5.3
2022-04-22
Applied Intelligence
Abstract: Although the Siamese network is the most popular approach in object tracking, it is prone to an undesirable trivial solution and requires a large amount of training data that reflects changes in the object's shape in every frame. To solve this problem, this paper proposes a self-supervised learning method for multi-object tracking (SSL-MOT) based on a contrastive structure. Unlike existing SSL methods, we adopt a generative adversarial network as a preprocessing step to generate various pose changes of the tracked objects. A positive pair composed of the augmented image and the pose data is fed to the SSL network to learn an encoder that generates a non-collapsed output vector. To improve the discriminative power of the encoder output features, we propose an affinity correlation distance, which combines invariance and redundancy terms, as the loss function for learning. At test time, data association uses only the dot product between the output vectors of a tracker and a detection, which significantly reduces computation time and makes real-time online tracking at about 12 fps possible. The proposed method is the first attempt to apply SSL to online MOT. Experimental results on the MOT16, MOT17, and MOT20 challenge datasets show that the proposed method is a fast and reasonable tracking method that occupies less memory and achieves excellent tracking performance compared with other state-of-the-art methods.
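The dot-product data association described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`l2_normalize`, `associate`), the L2 normalization, the greedy one-to-one matching, and the `min_affinity` threshold are all assumptions made for illustration. The abstract states only that the dot product between the tracker and detection output vectors is used.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (assumed here so dot products are comparable)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def associate(track_embs, det_embs, min_affinity=0.5):
    """Greedily match tracks to detections by descending dot-product affinity.

    track_embs, det_embs: lists of encoder output vectors (hypothetical inputs).
    min_affinity: hypothetical cutoff below which a pair is not matched.
    Returns (matches, unmatched_tracks, unmatched_detections).
    """
    tracks = [l2_normalize(t) for t in track_embs]
    dets = [l2_normalize(d) for d in det_embs]

    # Score every track/detection pair, highest affinity first.
    scored = sorted(
        ((dot(t, d), i, j) for i, t in enumerate(tracks) for j, d in enumerate(dets)),
        reverse=True,
    )

    matches, used_tracks, used_dets = [], set(), set()
    for score, i, j in scored:
        if score < min_affinity:
            break  # all remaining pairs are weaker still
        if i in used_tracks or j in used_dets:
            continue  # keep the assignment one-to-one
        matches.append((i, j))
        used_tracks.add(i)
        used_dets.add(j)

    unmatched_tracks = [i for i in range(len(tracks)) if i not in used_tracks]
    unmatched_dets = [j for j in range(len(dets)) if j not in used_dets]
    return matches, unmatched_tracks, unmatched_dets
```

In a tracker loop, unmatched detections would typically start new tracks and unmatched tracks would be aged out; an optimal assignment solver could replace the greedy loop if one-to-one matching quality matters more than simplicity.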
computer science, artificial intelligence