CFANet: Efficient Detection of UAV Image Based on Cross-Layer Feature Aggregation

Yunzuo Zhang, Cunyu Wu, Wei Guo, Tian Zhang, Wei Li
DOI: https://doi.org/10.1109/tgrs.2023.3273314
IF: 8.2
2023-01-01
IEEE Transactions on Geoscience and Remote Sensing
Abstract: With the rapid development of the unmanned aerial vehicle (UAV) industry, object detection in UAV images has become a research hotspot. However, because UAV images contain large numbers of dense small objects, detecting and accurately classifying those objects quickly remains a challenge. Motivated by this observation, we propose an efficient object detection network for UAV images based on cross-layer feature aggregation (CFANet). First, we design a novel cross-layer feature aggregation (CFA) module that aggregates features at different scales while avoiding semantic gaps; the aggregated features replace the ordinary features in feature fusion and enable accurate detection. This compensates for a defect of layer-by-layer feature transfer, which attends only to the features of the previous layer and cannot fully integrate spatial and semantic information. Second, a layered associative spatial pyramid pooling (LASPP) module is proposed to capture context information while keeping the feature maps at different layers sensitive to detail. Third, the alpha-intersection over union (alpha-IoU) loss function is introduced to accelerate model convergence and improve detection accuracy. Finally, an adaptive overlapping slice (AOS) method for high-resolution images is proposed to preserve the integrity of objects during slicing. To verify the effectiveness of the proposed method, extensive experiments are carried out on two UAV object detection datasets: the VisDrone2021 challenge dataset and the Unmanned Aerial Vehicle Benchmark: Object Detection and Tracking (UAVDT) dataset. The results show that, compared with other state-of-the-art detectors, the proposed method achieves strong detection performance while maintaining real-time speed.
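The alpha-IoU loss mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the basic power form of the loss, 1 - IoU^alpha, applied to a single pair of axis-aligned boxes in (x1, y1, x2, y2) format. The function names `iou` and `alpha_iou_loss` and the default `alpha=3.0` are assumptions made for illustration; the abstract does not state which variant or exponent CFANet uses.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Clamp at zero so disjoint boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def alpha_iou_loss(pred, target, alpha=3.0):
    """Basic alpha-IoU loss: 1 - IoU**alpha.

    alpha=1 recovers the plain IoU loss. The default of 3.0 is an
    assumption for this sketch, not a value taken from the abstract.
    """
    return 1.0 - iou(pred, target) ** alpha


if __name__ == "__main__":
    pred = (0.0, 0.0, 2.0, 2.0)
    target = (1.0, 0.0, 3.0, 2.0)
    print(alpha_iou_loss(pred, target, alpha=1.0))
    print(alpha_iou_loss(pred, target, alpha=3.0))
```

With alpha greater than 1, the exponent reweights the loss and its gradient according to how well a box already overlaps its target, which is the mechanism usually credited for faster convergence. A training implementation would operate on batched tensors in a deep learning framework rather than on Python tuples.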