YolTrack: Multitask Learning Based Real-Time Multiobject Tracking and Segmentation for Autonomous Vehicles

Xuepeng Chang, Huihui Pan, Weichao Sun, Huijun Gao
DOI: https://doi.org/10.1109/tnnls.2021.3056383
IF: 14.255
2021-01-01
IEEE Transactions on Neural Networks and Learning Systems
Abstract: Modern autonomous vehicles must perform various visual perception tasks for scene construction and motion decision-making. Multiobject tracking and instance segmentation (MOTS) are the main tasks, since they directly influence the steering and braking of the car. Implementing both tasks in a single multitask neural network presents significant challenges in performance and complexity. Current work on MOTS focuses on improving network precision with two-stage tracking-by-detection models, which struggle to meet the real-time requirements of autonomous vehicles. In this article, a real-time multitask network named YolTrack, based on a one-stage instance segmentation model, is proposed to perform the MOTS task; it achieves an inference speed of 29.5 frames per second (fps) with a slight drop in accuracy and precision. YolTrack uses ShuffleNet V2 with a feature pyramid network (FPN) as its backbone, from which two decoders are extended to generate instance segments and embedding vectors. Segmentation masks are used to improve tracking performance by applying a logical AND operation to the feature maps, which shows that foreground segmentation plays an important role in object tracking. The different scales of the multiple tasks are balanced by an optimized geometric mean loss during training. Experimental results on the KITTI MOTS data set show that YolTrack outperforms other state-of-the-art MOTS architectures in real-time performance and is suitable for deployment in autonomous vehicles.
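Two ideas in the abstract can be illustrated with a minimal sketch: gating feature maps with a segmentation mask, and combining task losses with a geometric mean. The code below is not the authors' implementation. The function names, the average pooling over foreground pixels, and the `eps` term are assumptions made for illustration, and the plain geometric mean shown here may differ from the paper's "optimized" variant, which the abstract does not specify.

```python
import math


def mask_gated_embedding(features, mask):
    """Average each feature channel over foreground pixels only.

    features: list of C channels, each an H x W nested list of floats.
    mask:     H x W nested list of booleans (True = foreground).
    Hypothetical helper: the pooling step is an assumption, not from the paper.
    """
    area = sum(1 for row in mask for m in row if m)
    if area == 0:
        # No foreground pixels: return a zero embedding.
        return [0.0] * len(features)
    embedding = []
    for channel in features:
        # Keeping only values where the mask is True is the AND-style gating.
        total = sum(
            value
            for f_row, m_row in zip(channel, mask)
            for value, m in zip(f_row, m_row)
            if m
        )
        embedding.append(total / area)
    return embedding


def geometric_mean_loss(losses, eps=1e-12):
    """Plain geometric mean of positive task losses, computed in log space.

    eps is an assumed guard against log(0).
    """
    log_sum = sum(math.log(loss + eps) for loss in losses)
    return math.exp(log_sum / len(losses))


if __name__ == "__main__":
    features = [[[0.0, 1.0], [2.0, 3.0]], [[4.0, 5.0], [6.0, 7.0]]]
    mask = [[True, False], [False, True]]
    print(mask_gated_embedding(features, mask))
    print(geometric_mean_loss([1.0, 4.0]))
```

One reason a geometric mean suits multitask balancing: rescaling any single task loss by a constant only multiplies the combined loss by a constant, so no task dominates merely because its loss is numerically larger.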