Residual Spatial Reduced Transformer Based on YOLOv5 for UAV Images Object Detection

Li Chen, Naimeng Cang, Wenbo Zhang, Chan Zhang, Weidong Zhang, Dongsheng Guo
DOI: https://doi.org/10.1142/s0218001424500071
IF: 1.261
2024-01-01
International Journal of Pattern Recognition and Artificial Intelligence
Abstract: Object detection on unmanned aerial vehicle (UAV) images is an important branch of object detection and, in a broad sense, belongs to small object detection. Detecting objects in UAV images is especially challenging because UAVs capture images from varying heights and angles, which leads to a predominance of small objects and dense occlusion. To address these problems, we propose the Residual Spatial Reduced Transformer based on YOLOv5 (RSRT-YOLOv5). First, a Slice Aided Enhancement Module (SAEM) is introduced to enhance the feature quality of small objects. Second, a Global attention-based Bi-directional Feature Fusion (GBFF) module is proposed. Third, an efficient Residual Spatial Reduced Transformer (RSRT) module is integrated into the Neck architecture to achieve more efficient feature representation and richer global contextual associations. Finally, our method is evaluated on the VisDrone2019 dataset, and the experimental results show that RSRT-YOLOv5 outperforms the baseline model (YOLOv5) and improves detection performance on UAV images.
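The abstract does not describe how SAEM works internally. As a loose illustration of the general slicing idea often used for small objects in large aerial images, the sketch below computes overlapping tile boxes that cover an image, so that each tile can be processed at a higher effective resolution. The function name `slice_boxes`, the parameters `tile` and `overlap_px`, and their default values are hypothetical and are not taken from the paper; this is not the authors' SAEM.

```python
from typing import List, Tuple

# (x_min, y_min, x_max, y_max) in pixel coordinates; hypothetical helper type
Box = Tuple[int, int, int, int]


def slice_boxes(width: int, height: int, tile: int = 640,
                overlap_px: int = 128) -> List[Box]:
    """Return overlapping tile boxes that together cover the whole image.

    Assumption: square tiles of side `tile`, with neighbouring tiles
    sharing at least `overlap_px` pixels. Not the paper's SAEM.
    """
    if tile <= 0 or not 0 <= overlap_px < tile:
        raise ValueError("need tile > 0 and 0 <= overlap_px < tile")
    stride = tile - overlap_px

    def starts(size: int) -> List[int]:
        # One tile is enough when the image side fits inside a tile.
        if size <= tile:
            return [0]
        positions = list(range(0, size - tile, stride))
        # Place the last tile flush with the border so nothing is missed.
        positions.append(size - tile)
        return positions

    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in starts(height)
        for x in starts(width)
    ]


if __name__ == "__main__":
    for box in slice_boxes(1920, 1080):
        print(box)
```

Objects cut by one tile border fall entirely inside a neighbouring tile as long as they are smaller than the overlap, which is the usual reason for overlapping the tiles rather than using a plain grid.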