Real Time Human Detection by Unmanned Aerial Vehicles

Walid Guettala, Ali Sayah, Laid Kahloul, Ahmed Tibermacine
2024-01-07
Abstract: Object detection, which locates and classifies instances of particular object categories in images, is one of the most important problems in computer vision and remote sensing. Thermal infrared (TIR) images and videos captured by unmanned aerial vehicles (UAVs) across multiple scenarios are two crucial data sources for public security. Object detection on this data remains difficult because targets are small, scenes are complex, resolution is low relative to visible-light video, and publicly available labeled datasets and trained models are scarce. This study proposes a UAV TIR object detection framework for images and videos. Ground-based TIR images and videos collected with Forward-Looking Infrared (FLIR) cameras are used to build the "You Only Look Once" (YOLO) model, which is based on a convolutional neural network (CNN) architecture. In the validation task, the state-of-the-art YOLOv7 (YOLO version 7) model [1] detected humans with an average precision of 72.5% at an Intersection over Union (IoU) threshold of 0.5, at a detection speed of around 161 frames per second (FPS). The application demonstrates the usefulness of the YOLO architecture by evaluating the cross-detection performance of a YOLOv7 model on people in UAV TIR videos across the UAVs' various observation angles. This work supports the qualitative and quantitative evaluation of object detection in TIR images and videos using deep-learning models.
Computer Vision and Pattern Recognition, Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses real-time human detection from unmanned aerial vehicles (UAVs). Object detection, the task of identifying specific objects in images, is important in computer vision and remote sensing. Thermal infrared (TIR) multi-scene images and videos captured by drones are two important data sources for public safety. However, object detection on this data remains challenging because targets are small, scenes are complex, video resolution is relatively low, and publicly available annotated datasets and trained models are lacking. The paper proposes a drone-based thermal infrared object detection framework built on the "You Only Look Once" (YOLO) architecture. The authors collect ground-based thermal infrared images and videos using FLIR cameras and train the YOLOv7 model (the 7th version of YOLO). Experimental results show that YOLOv7 achieves an average precision of 72.5% at IoU = 0.5 in the validation task, with a detection speed of approximately 161 frames per second. The study also evaluates the cross-detection performance of the YOLOv7 model for human detection from different drone observation angles. The main contributions of the paper are: (1) the creation of a drone-based human detection dataset; and (2) the improvement of the YOLO network structure through transfer learning, which expands the receptive field and enhances detection of small human bodies. The experimental section compares the approach with other methods and reports advantages for the YOLO architecture in real-time detection and processing speed. Despite challenges such as the difficulty of detecting small targets and the limitations of the dataset, the study shows that YOLOv7 performs well at detecting small human bodies against complex backgrounds. Planned future work includes extending detection to other object categories and performing detection on videos in order to track object movement trajectories.
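The two headline figures, average precision at IoU = 0.5 and roughly 161 FPS, can be made concrete with a short sketch. The code below is illustrative only and is not taken from the paper: the box format (x_min, y_min, x_max, y_max) and the helper names `iou`, `is_match`, and `frame_budget_ms` are assumptions. It shows how Intersection over Union is typically computed for two axis-aligned boxes, how a 0.5 threshold decides whether a predicted box counts as a correct detection, and how a frame rate translates into a per-frame time budget.

```python
from typing import Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle (if any).
    ix_min = max(a[0], b[0])
    iy_min = max(a[1], b[1])
    ix_max = min(a[2], b[2])
    iy_max = min(a[3], b[3])
    # Clamp to zero when the boxes do not overlap.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(pred: Box, truth: Box, threshold: float = 0.5) -> bool:
    """A prediction counts as correct when its IoU meets the threshold."""
    return iou(pred, truth) >= threshold


def frame_budget_ms(fps: float) -> float:
    """Time available per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


if __name__ == "__main__":
    truth = (0.0, 0.0, 10.0, 10.0)
    print(is_match((2.0, 0.0, 12.0, 10.0), truth))  # IoU = 80 / 120
    print(is_match((5.0, 0.0, 15.0, 10.0), truth))  # IoU = 50 / 150
    print(round(frame_budget_ms(161), 2))
```

Under these assumptions, a box shifted by 2 pixels against a 10 x 10 ground-truth box still passes the 0.5 threshold, while a 5-pixel shift does not, which illustrates why small targets are hard: a few pixels of localization error remove most of the overlap. At 161 FPS the detector has a little over 6 ms per frame.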