Object Detection from UAV Thermal Infrared Images and Videos Using YOLO Models.

Chenchen Jiang, Huazhong Ren, Xin Ye, Jinshun Zhu, Hui Zeng, Yang Nan, Min Sun, Xiang Ren, Hongtao Huo
DOI: https://doi.org/10.1016/j.jag.2022.102912
IF: 7.5
2022-01-01
International Journal of Applied Earth Observation and Geoinformation
Abstract: Object detection, which identifies specific categories of objects in images, is one of the most crucial tasks in computer vision and remote sensing. Multi-scenario thermal infrared (TIR) images and videos acquired by unmanned aerial vehicles (UAVs) are two important data sources in public security. However, detecting objects in them remains challenging because of the complicated scene information, the coarser resolution compared with visible-light videos, and the lack of public labelled datasets and training models. This study proposed a UAV TIR object detection framework for images and videos. The You Only Look Once (YOLO) models, based on a Convolutional Neural Network (CNN) architecture, were designed to extract features from ground-based TIR images and videos captured by Forward-looking Infrared (FLIR) cameras. The most effective algorithm was identified using evaluation metrics and then applied to detect objects in TIR videos from UAVs. Results showed that the highest mean average precision (mAP) for the person and car instances was 88.69% in the validation task. The fastest detection speed reached 50 frames per second (FPS), and the smallest model size was observed in YOLOv5-s. In the application, the cross-detection performance on persons and cars in UAV TIR videos under a YOLOv5-s model was discussed in terms of the UAVs' different observation angles, and the effectiveness of the YOLO architecture was demonstrated. This study provides positive support for the qualitative and quantitative evaluation of object detection from TIR images and videos using deep-learning models.
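The abstract reports mAP over the person and car classes but does not state the evaluation protocol (IoU threshold, interpolation scheme). As a rough illustration only, the sketch below computes an all-point interpolated average precision per class and averages the results. The function names, the input format, and the choice of interpolation are assumptions made for this example, not the authors' implementation.

```python
def average_precision(scores, is_true_positive, num_ground_truth):
    """All-point interpolated AP for a single class (illustrative sketch).

    scores            -- confidence of each detection
    is_true_positive  -- True if the detection matched a ground-truth box
                         (the matching rule, e.g. an IoU threshold, is assumed
                         to have been applied beforehand)
    num_ground_truth  -- number of ground-truth boxes for this class
    """
    if num_ground_truth == 0 or not scores:
        return 0.0

    # Rank detections by descending confidence.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    true_pos, false_pos = 0, 0
    recalls, precisions = [], []
    for i in order:
        if is_true_positive[i]:
            true_pos += 1
        else:
            false_pos += 1
        recalls.append(true_pos / num_ground_truth)
        precisions.append(true_pos / (true_pos + false_pos))

    # Replace each precision by the maximum precision at any higher recall,
    # which makes the precision-recall curve non-increasing.
    for k in range(len(precisions) - 2, -1, -1):
        precisions[k] = max(precisions[k], precisions[k + 1])

    # Sum the area under the stepwise precision-recall curve.
    ap, prev_recall = 0.0, 0.0
    for recall, precision in zip(recalls, precisions):
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap


def mean_average_precision(ap_per_class):
    """Mean of the per-class AP values, e.g. {"person": ..., "car": ...}."""
    return sum(ap_per_class.values()) / len(ap_per_class)


# Toy inputs (made up for illustration, not data from the paper).
ap_person = average_precision([0.9, 0.8], [True, True], num_ground_truth=2)
ap_car = average_precision([0.9, 0.8, 0.7], [True, False, True], num_ground_truth=2)
map_value = mean_average_precision({"person": ap_person, "car": ap_car})
```

Under this definition, a class whose ground-truth objects are all found before any false alarm scores an AP of 1, and each false positive ranked above a true positive lowers the score.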