Object Detection in Thermal Images Using Deep Learning for Unmanned Aerial Vehicles

Minh Dang Tu, Kieu Trang Le, Manh Duong Phung
DOI: https://doi.org/10.1109/SII58957.2024.10417611
2024-02-13
Abstract: This work presents a neural network model capable of recognizing small and tiny objects in thermal images collected by unmanned aerial vehicles. Our model consists of three parts: the backbone, the neck, and the prediction head. The backbone is based on the structure of YOLOv5, with a transformer encoder added at the end. The neck includes a BI-FPN block combined with a sliding window and a transformer to increase the information fed into the prediction head. The prediction head carries out the detection by evaluating feature maps with the Sigmoid function. The use of transformers with attention and sliding windows increases recognition accuracy while keeping the number of parameters and the computation requirements of the model reasonable for embedded systems. Experiments conducted on the public VEDAI dataset and on our own collected datasets show that our model has a higher accuracy than state-of-the-art methods such as ResNet, Faster RCNN, ComNet, ViT, YOLOv5, SMPNet, and DPNetV3. Experiments on the embedded computer Jetson AGX show that our model achieves a real-time computation speed with a stability rate of over 90%.
Machine Learning, Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses the following problem: **how to accurately detect small and tiny objects in thermal images collected by unmanned aerial vehicles (UAVs)**. Specifically, the authors focus on two difficulties:

1. **Low resolution, uneven thermal background, and high noise**: Thermal images usually carry relatively little information, and the information is even more limited for objects with large temperature changes (such as ships or parked vehicles).
2. **Object clustering**: When multiple objects are close to each other in a UAV image, they may be recognized as a single object.

To address these problems, the authors propose a deep-learning-based neural network model that identifies small and tiny objects in thermal images more accurately. The main improvements of the model are:

- **Attention mechanism**: Feature maps are extracted through multi-dimensional pyramids, and the attention mechanism is used to enrich the data and expand the information area, which improves the recognition of small objects.
- **Backbone optimization**: GhostConv reduces the number of parameters, while the Bottleneck network extracts additional information to compensate for the low information content of thermal images.
- **Sliding window combined with Transformer**: A sliding window and a self-attention mechanism are applied to the feature map to detect objects quickly and reduce computational complexity.

With these improvements, the model achieves higher recognition accuracy and can also run in real time on embedded systems (such as those carried by UAVs). Experimental results show that, on the public VEDAI dataset and on self-collected datasets, the model outperforms existing advanced methods such as ResNet, Faster RCNN, ComNet, ViT, and YOLOv5.
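
The parameter and complexity savings mentioned above can be illustrated with a rough back-of-envelope count. The sketch below is not the authors' implementation. It assumes a GhostConv layout like the one in the public YOLOv5 code base (a primary convolution producing half of the output channels, followed by a cheap 5x5 depthwise convolution producing the other half), and it assumes non-overlapping square attention windows, which may differ from the sliding-window scheme used in the paper. All function names and the example sizes are hypothetical.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a plain k x k convolution (bias and batch norm ignored)."""
    return c_in * c_out * k * k


def ghost_conv_params(c_in, c_out, k, cheap_k=5):
    """Weights in a GhostConv-style block under the assumed layout.

    Half of the output channels come from a regular k x k convolution;
    the other half come from a depthwise cheap_k x cheap_k convolution
    applied to the first half (one filter per channel).
    """
    if c_out % 2 != 0:
        raise ValueError("c_out must be even for this layout")
    half = c_out // 2
    primary = c_in * half * k * k      # regular convolution
    cheap = half * cheap_k * cheap_k   # depthwise convolution
    return primary + cheap


def global_attention_pairs(h, w):
    """Query-key pairs when every position attends to every position."""
    n = h * w
    return n * n


def windowed_attention_pairs(h, w, window):
    """Query-key pairs when attention is restricted to non-overlapping
    window x window tiles (an assumed simplification)."""
    if h % window != 0 or w % window != 0:
        raise ValueError("h and w must be divisible by window")
    tokens_per_window = window * window
    num_windows = (h // window) * (w // window)
    return num_windows * tokens_per_window ** 2


if __name__ == "__main__":
    # Hypothetical layer sizes chosen only for illustration.
    print("standard conv weights:", standard_conv_params(128, 256, 3))
    print("ghost conv weights:   ", ghost_conv_params(128, 256, 3))
    print("global attention pairs:  ", global_attention_pairs(40, 40))
    print("windowed attention pairs:", windowed_attention_pairs(40, 40, 8))
```

Under these assumptions, the GhostConv-style block needs roughly half the weights of a plain convolution with the same input and output channels, and windowed attention on a 40x40 feature map with 8x8 windows evaluates 25 times fewer query-key pairs than global attention. The actual savings in the paper's model depend on its real layer configuration.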