Research on small objects detection algorithm of UAV photography based on improved YOLOv7

XuLiang Duan, BuYuan Zhang, QinWen Deng, HongYang Ma, Bing Yang
DOI: https://doi.org/10.21203/rs.3.rs-4302780/v1
2024-01-01
Abstract: Unmanned Aerial Vehicles (UAVs) capture aerial photographs with a wide viewing angle, variable backgrounds, and high-speed motion imaging. Object detection in UAV aerial images is challenging because object scale varies widely, objects are small and occlude one another, and little feature information is available. Conventional object detection algorithms offer poor real-time performance and accuracy in this field. The YOLO algorithm is prone to high false detection and missed detection rates for small objects in complex scenes, which lowers detection accuracy. To address these issues, we propose the PCSM-YOLOv7 algorithm, an improvement on the YOLOv7 model. To better locate small, low-resolution objects, we incorporate a higher-resolution small object detection feature layer in the Neck of the model and add a detection head, P2, that targets a feature map of 160×160 pixels. This expands the detection range of the model and enhances the semantic information. To achieve high throughput and low latency on GPUs, the P4 and P5 modules of the Backbone use partial convolution (PConv) instead of standard convolution (Conv). The Backbone also includes the coordinate attention (CA) mechanism to improve the model's detection of small objects and reduce the false detection rate. To address the slow convergence of the loss function, we propose a novel similarity metric, the Minimum Point Distance Intersection over Union (MPDIoU) loss function, for bounding box regression (BBR). We tested our algorithm on the VisDrone2019 public dataset for UAV photography. The experimental results show that our improved model achieves an mAP@0.5 of 52.3%, which is 3.4 percentage points higher than the baseline model's mAP@0.5 of 48.9%. This improvement enhances the accuracy of detecting small objects in UAV photography.
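The abstract names the MPDIoU loss but does not state its formula. As a minimal sketch, assuming the commonly cited definition (IoU minus the squared distances between the two boxes' top-left corners and between their bottom-right corners, each normalized by the squared diagonal of the input image), the loss for a single pair of boxes could be computed as follows. The function name `mpdiou_loss`, the `(x1, y1, x2, y2)` box layout, and the `eps` parameter are illustrative assumptions, not taken from the paper.

```python
def mpdiou_loss(pred, target, img_w, img_h, eps=1e-9):
    """Sketch of an MPDIoU-style loss for one box pair (assumed formulation).

    Boxes are (x1, y1, x2, y2) with x1 < x2 and y1 < y2, in pixels.
    img_w and img_h are the input image width and height.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Intersection area (zero when the boxes do not overlap).
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h

    # Union area and plain IoU; eps guards against division by zero.
    area_p = (px2 - px1) * (py2 - py1)
    area_t = (tx2 - tx1) * (ty2 - ty1)
    iou = inter / (area_p + area_t - inter + eps)

    # Squared distances between matching corners.
    d1_sq = (px1 - tx1) ** 2 + (py1 - ty1) ** 2  # top-left corners
    d2_sq = (px2 - tx2) ** 2 + (py2 - ty2) ** 2  # bottom-right corners

    # Normalize by the squared image diagonal (assumed normalizer).
    norm = img_w ** 2 + img_h ** 2
    mpdiou = iou - d1_sq / norm - d2_sq / norm

    return 1.0 - mpdiou


# Hypothetical usage on a 640x640 input.
perfect = mpdiou_loss((0, 0, 10, 10), (0, 0, 10, 10), 640, 640)
disjoint = mpdiou_loss((0, 0, 10, 10), (20, 20, 30, 30), 640, 640)
```

Under this formulation the corner-distance terms stay nonzero even when the boxes do not overlap, so the loss still changes as a disjoint prediction moves toward the target, whereas plain IoU would stay at zero. This is one plausible reading of why such a loss would help with the slow convergence mentioned in the abstract.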