The Unmanned Aerial Vehicle Benchmark: Object Detection and Tracking

Dawei Du, Yuankai Qi, Hongyang Yu, Yifan Yang, Kaiwen Duan, Guorong Li, Weigang Zhang, Qingming Huang, Qi Tian
DOI: https://doi.org/10.48550/arXiv.1804.00518
2018-03-26
Abstract: With the advantage of high mobility, Unmanned Aerial Vehicles (UAVs) are used to fuel numerous important applications in computer vision, delivering more efficiency and convenience than surveillance cameras with fixed camera angle, scale and view. However, very few UAV datasets have been proposed, and they focus only on a specific task such as visual tracking or object detection in relatively constrained scenarios. Consequently, it is of great importance to develop an unconstrained UAV benchmark to boost related research. In this paper, we construct a new UAV benchmark focusing on complex scenarios with new levels of challenge. Selected from 10 hours of raw video, about 80,000 representative frames are fully annotated with bounding boxes as well as up to 14 kinds of attributes (e.g., weather condition, flying altitude, camera view, vehicle category, and occlusion) for three fundamental computer vision tasks: object detection, single object tracking, and multiple object tracking. Then, a detailed quantitative study is performed using the most recent state-of-the-art algorithms for each task. Experimental results show that the current state-of-the-art methods perform relatively worse on our dataset, due to the new challenges that appear in UAV-based real scenes, e.g., high density, small objects, and camera motion. To our knowledge, our work is the first to explore such issues in unconstrained scenes comprehensively.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses the challenges of object detection and tracking on the Unmanned Aerial Vehicle (UAV) platform. It points out that most current datasets are composed of videos captured by cameras on fixed or moving vehicles, and that these videos offer limited perspectives in surveillance scenarios. With their high mobility and wide field of view, UAVs open new possibilities for computer vision tasks, but they also bring new challenges, such as high-density targets, small-target detection, and the impact of camera motion. The paper therefore constructs a large-scale, challenging UAV Detection and Tracking benchmark (UAVDT) to promote related research and to evaluate how existing algorithms perform under these new challenges.

The main contributions of the paper are:

1. **Dataset construction**: A fully annotated dataset covering three basic tasks: object detection (DET), single-object tracking (SOT), and multi-object tracking (MOT). It contains approximately 80,000 representative frames, selected from 10 hours of raw video and annotated with bounding boxes and up to 14 kinds of attributes (such as weather condition, flying altitude, camera view, vehicle category, and occlusion).
2. **Algorithm evaluation**: An extensive evaluation of recent state-of-the-art algorithms for each task, including a breakdown of performance under the various attributes.

Through these contributions, the paper aims to advance research on UAV-based object detection and tracking in complex scenarios and to support practical applications.