Deep Association: End-to-end Graph-Based Learning for Multiple Object Tracking with Conv-Graph Neural Network.

Cong Ma, Yuan Li, Fan Yang, Ziwei Zhang, Yueqing Zhuang, Huizhu Jia, Xiaodong Xie
DOI: https://doi.org/10.1145/3323873.3325010
2019-01-01
Abstract: Multiple Object Tracking (MOT) has a wide range of applications in surveillance retrieval and autonomous driving. Most existing methods focus on extracting features with deep learning and then optimizing a hand-crafted bipartite graph or network flow. In this paper, we propose an efficient end-to-end model, the Deep Association Network (DAN), which learns from graph-based training data constructed from the spatial-temporal interactions of objects. DAN combines a Convolutional Neural Network (CNN), a Motion Encoder (ME), and a Graph Neural Network (GNN). The CNNs and Motion Encoders extract appearance features from bounding-box images and motion features from positions, respectively, and the GNN then optimizes the graph structure to associate the same object across frames. In addition, we present a novel end-to-end training strategy for DAN. Our experimental results demonstrate that DAN reaches the level of state-of-the-art methods on MOT16 and DukeMTMCT without using extra datasets.
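The association step described in the abstract can be pictured with a small toy example. The sketch below is not the authors' DAN: it assumes appearance features are already given as plain vectors (standing in for CNN output), uses normalized box centres as a stand-in for the Motion Encoder, and replaces the learned GNN with a fixed affinity function plus greedy matching. All names (`affinity`, `associate`, the `app` and `centre` keys, the 0.6 threshold) are hypothetical and chosen only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0


def affinity(det_a, det_b):
    """Edge weight between two detections.

    Assumption: an equal blend of appearance similarity and centre proximity.
    In the paper this role is played by a learned network, not a fixed formula.
    """
    appearance = cosine(det_a["app"], det_b["app"])
    (xa, ya), (xb, yb) = det_a["centre"], det_b["centre"]
    motion = math.exp(-math.hypot(xa - xb, ya - yb))  # centres normalized to [0, 1]
    return 0.5 * appearance + 0.5 * motion


def associate(prev, curr, threshold=0.6):
    """Greedily match current-frame detections to previous-frame detections.

    Builds every edge of the bipartite graph between the two frames, then
    accepts edges from strongest to weakest, skipping nodes already matched.
    Returns a dict mapping current index -> previous index.
    """
    edges = [
        (affinity(p, c), i, j)
        for i, p in enumerate(prev)
        for j, c in enumerate(curr)
    ]
    edges.sort(reverse=True)

    matches, used_prev, used_curr = {}, set(), set()
    for score, i, j in edges:
        if score < threshold:
            break  # remaining edges are weaker still
        if i in used_prev or j in used_curr:
            continue
        matches[j] = i
        used_prev.add(i)
        used_curr.add(j)
    return matches
```

Calling `associate(prev, curr)` on two lists of detections returns the matched index pairs; a current-frame detection missing from the result has no edge above the threshold and would be treated as a new object in this toy setting.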