Detecting UAV Target with Deep Convolutional Neural Network

Mutagisha Norbelt, Xiling Luo, Jinping Sun, Junjun Wang, Uwimana Claude, Muhammad Wisal
DOI: https://doi.org/10.1145/3686625.3686628
2024-01-01
Abstract: In recent years, Convolutional Neural Networks (CNNs) have become a widely used technique for tackling computer vision challenges, with image detection being a popular application. The performance of CNNs in object detection has improved rapidly in both accuracy and speed. Unmanned Aerial Vehicles (UAVs), or drones, have seen exponential growth in adoption across sectors ranging from military and surveillance to agriculture and infrastructure inspection. Their ability to capture aerial data and perform remote sensing tasks has made them invaluable tools for monitoring, mapping, and data collection. However, UAVs flying near runways can endanger airplanes during take-off and landing. Drones can also threaten stadiums, prisons, and military camps because they can carry payloads and bypass ground security. This research concentrates on one particular object detection task: identifying UAVs. The task is made more demanding because the data are extracted from video footage captured simultaneously by a visible light camera and an infrared camera, a setup that results in difficult viewing angles and significant motion blur. To assess both accuracy and speed, the study selected two models with typical architectures: You Only Look Once (YOLO) and Single Shot MultiBox Detector (SSD). The goal was to evaluate the robustness of these models in detecting UAVs under motion blur and to understand how the structural differences between the models influence the results. Several experiments were conducted to achieve these objectives. The findings provide insight into the performance of these object detection models in the challenging scenario of UAV detection from video streams with motion blur.
The results showed that, owing to its better structural design, YOLOv3 outperformed SSD in several respects. Further experiments on UAV data from the dataset demonstrated the efficiency of YOLOv3 when trained on more images. Regarding the motion blur problem, the experiments demonstrated that the YOLOv3 model has a good ability to recognize and learn blurred visual patterns. The structure of SSD was further tested by examining the design of its default boxes and their performance at different scales and locations. Overall, the YOLOv3 model performed better than SSD for UAV detection in video streams. The experimental results on the Discovery UAVs Dataset confirmed the value of the proposed techniques for UAV detection. On the Discovery UAVs test dataset, YOLOv3 achieved 93% precision and 89.3% recall, while SSD achieved 88.9% precision and 86.3% recall.
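For readers comparing the two detectors, the reported metrics can be made concrete with a minimal sketch, assuming the standard definitions of precision and recall in terms of true positives, false positives, and false negatives. The function names and the detection counts below are hypothetical and are not taken from the paper; the F1 scores are derived here by arithmetic from the reported precision and recall and are not figures stated by the authors.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) from detection counts.

    tp: detections matched to a ground-truth UAV
    fp: detections with no matching ground-truth UAV
    fn: ground-truth UAVs that were not detected
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, for illustration only (not from the paper).
p, r = precision_recall(tp=90, fp=10, fn=30)
print(f"example: precision = {p:.2f}, recall = {r:.2f}")

# Precision and recall as reported in the abstract.
reported = {"YOLOv3": (0.93, 0.893), "SSD": (0.889, 0.863)}
for name, (prec, rec) in reported.items():
    # F1 is derived here; the abstract does not report it.
    print(f"{name}: F1 = {f1_score(prec, rec):.3f}")
```

Under these definitions, the reported figures correspond to an F1 score of roughly 0.911 for YOLOv3 and 0.876 for SSD, consistent with the abstract's conclusion that YOLOv3 performs better on this dataset.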