Abstract: Multiple object tracking is a fundamental task in computer vision. Visual tracking becomes challenging when objects move in groups and occlude one another. There are two mainstream strategies for handling such groups. One transforms the data association problem into a graph matching problem; the other applies the social power model as a high-level constraint for group tracking. In the former case, the difficulty of solving the problem grows geometrically with the number of tracked objects, so the computational efficiency required for real-time tracking cannot be met. The latter strategy tends to rely on fixed-size groups or rules trained offline, and the resulting lack of flexibility limits its generalization across scenarios. To address these shortcomings, this paper proposes a novel multiple object tracking method based on spatio-temporal correlation and graph neural networks. First, a spatio-temporal relationship learning module extracts relational features from the historical trajectories, modeling the spatio-temporal correlations of the objects and dynamically constructing the group structure online. Then, a graph neural network combines appearance and motion information, and the similarity between each detection and tracklet is used as a weight in node feature aggregation so that node features become strongly discriminative. The spatio-temporal correlation method is also used to recover targets lost because of occlusion. Even when paired with a simple linear assignment data association method, the approach achieves good tracking results at low computational complexity. Experiments on three challenging public datasets, namely MOT16, MOT17, and MOT20, validate the accuracy and efficiency of the proposed tracking method.
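The two mechanisms named in the abstract, similarity-weighted node feature aggregation and linear assignment for data association, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the use of cosine similarity, the softmax normalization of the weights, the residual update, and the exhaustive assignment search are all assumptions chosen for illustration. A practical tracker would use learned features and a Hungarian-style solver.

```python
import math
from itertools import permutations


def cosine(a, b):
    """Cosine similarity between two feature vectors (assumed similarity measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0


def similarity_matrix(tracklets, detections):
    """Rows index tracklets, columns index detections."""
    return [[cosine(t, d) for d in detections] for t in tracklets]


def aggregate(tracklets, detections, sim):
    """One round of similarity-weighted aggregation (hypothetical form).

    Each tracklet node receives a message that is the weighted mean of the
    detection features, with weights given by a softmax over its similarity
    row, and adds that message to its own feature (residual update).
    """
    updated = []
    for t, row in zip(tracklets, sim):
        exps = [math.exp(s) for s in row]
        z = sum(exps)
        msg = [
            sum((e / z) * d[k] for e, d in zip(exps, detections))
            for k in range(len(t))
        ]
        updated.append([ti + mi for ti, mi in zip(t, msg)])
    return updated


def assign(sim):
    """Optimal one-to-one assignment by exhaustive search.

    Assumes no more tracklets than detections; only suitable for tiny inputs.
    Returns a list of (tracklet_index, detection_index) pairs.
    """
    n_t, n_d = len(sim), len(sim[0])
    best = max(
        permutations(range(n_d), n_t),
        key=lambda p: sum(sim[i][j] for i, j in enumerate(p)),
    )
    return list(enumerate(best))
```

For example, with tracklet features `[[1, 0], [0, 1]]` and detection features `[[0.1, 0.9], [0.9, 0.1]]`, the similarity matrix pairs the first tracklet with the second detection and the second tracklet with the first. Weighting messages by similarity means each node is pulled mainly toward the detections it already resembles, which is one way the aggregation could sharpen the distinction between nodes.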

Graph Convolutional Tracking

Exploit Spatiotemporal Contextual Information for 3D Single Object Tracking Via Memory Networks

Graph Attention Network for Context-Aware Visual Tracking

RASTMTrack: Robust and Adaptive Space-Time Memory Networks for Visual Tracking

Transformer Union Convolution Network for Visual Object Tracking

Graph Attention Tracking

TGCN: Time Domain Graph Convolutional Network for Multiple Objects Tracking

SGAT: Shuffle and graph attention based Siamese networks for visual tracking

Siamese Graph Attention Networks for Robust Visual Object Tracking

Continuity-Discrimination Convolutional Neural Network for Visual Object Tracking

Dynamic Spatio-Temporal Feature Learning via Graph Convolution in 3D Convolutional Networks

CTT: CNN Meets Transformer for Tracking

SCGTracker: Spatio-temporal correlation and graph neural networks for multiple object tracking

Towards Real-World Visual Tracking with Temporal Contexts

The Multi-task Fully Convolutional Siamese Network with Correlation Filter Layer for Real-Time Visual Tracking

Online Video Tracking Using Collaborative Convolutional Networks

Target-Aware Tracking with Long-term Context Attention

STGL: Spatial-Temporal Graph Representation and Learning for Visual Tracking

UCT: Learning Unified Convolutional Networks for Real-time Visual Tracking

Spatial graph attention network-based object tracking with adaptive cosine window