Conditional GAN Based Individual and Global Motion Fusion for Multiple Object Tracking in UAV Videos

Hongyang Yu, Guorong Li, Li Su, Bineng Zhong, Hongxun Yao, Qingming Huang
DOI: https://doi.org/10.1016/j.patrec.2019.12.018
IF: 4.757
2019-01-01
Pattern Recognition Letters
Abstract: Multiple Object Tracking (MOT) faces great challenges in videos captured by Unmanned Aerial Vehicles (UAVs). Unlike in traditional videos, the high altitude and abrupt motion changes of UAVs mean that target objects in UAV videos are usually very small and their appearance information is unreliable. Motion analysis is therefore valuable for associating multiple objects in UAV videos. However, traditional motion analysis models inevitably suffer from the autonomous motion of UAVs. In this paper, we propose a conditional Generative Adversarial Network (GAN) based model to predict complex motions in UAV videos. We regard the object motions and the UAV movement as the individual motion and the global motion, respectively. They are complementary to each other and are employed jointly to facilitate accurate motion prediction. Specifically, a social Long Short-Term Memory network is exploited to estimate the individual motion of objects, a Siamese network is constructed to generate the global motion reflecting the view changes of the UAV, and a conditional GAN is developed to generate the final motion affinity. Extensive experiments are conducted on public UAV datasets containing various types of objects and 4 different kinds of object detection inputs. Robust motion prediction and improved MOT performance are achieved compared with state-of-the-art methods. (c) 2019 Elsevier B.V. All rights reserved.