Abstract: Unmanned aerial vehicle (UAV) image target detection is valuable for a wide range of applications. However, because the flight altitude of UAVs varies, the captured images often exhibit large differences in target scale and contain many small targets. Existing methods adapt poorly to these variations, which reduces detection accuracy. To address this issue, this article proposes FFAGRNet, a new method for UAV image object detection based on full-scale feature aggregation (FFA) and grouping feature reconstruction (GFR). First, existing feature fusion methods are hindered by their layer-by-layer transfer structure, which limits effective information exchange between feature maps of different scales. In response, we propose the FFA module, which performs scale adaptation and information aggregation across multiple sets of feature maps, producing high-quality aggregated feature maps. Second, to further refine the aggregated features and eliminate redundancy, we introduce the GFR module. This module subdivides the aggregated features into multiple sublevel features, allowing them to autonomously learn the channel and spatial layouts of target features. Finally, we present the parallel super-resolution semantic enhancement (PSSE) module to reconstruct deep feature maps and incorporate spatial contextual information, effectively increasing the proportion of semantic information and enhancing the model's ability to classify ambiguous targets. To validate the effectiveness of the proposed method, extensive experiments were conducted on the VisDrone2021 and UAVDT datasets. The results show that, compared with the baseline, our method improves mAP50 by 7.6% and 4.6%, respectively, and performs favorably against existing methods.
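
The abstract gives no implementation details, so the following is only a minimal illustrative sketch of the general idea behind aggregating feature maps from several scales and then splitting the result into channel groups. It is not the FFA or GFR module itself. The function names (`resize_nearest`, `aggregate_full_scale`, `split_into_groups`), the nearest-neighbour resizing, the channel concatenation, and the feature-pyramid shapes are all assumptions made for illustration.

```python
import numpy as np


def resize_nearest(fmap, out_h, out_w):
    """Nearest-neighbour resize of a (C, H, W) feature map to (C, out_h, out_w)."""
    _, h, w = fmap.shape
    rows = np.arange(out_h) * h // out_h  # source row for each output row
    cols = np.arange(out_w) * w // out_w  # source column for each output column
    return fmap[:, rows[:, None], cols[None, :]]


def aggregate_full_scale(fmaps, target_index):
    """Bring every map to the resolution of fmaps[target_index], then
    concatenate along the channel axis (an assumed, generic fusion rule)."""
    _, out_h, out_w = fmaps[target_index].shape
    resized = [resize_nearest(f, out_h, out_w) for f in fmaps]
    return np.concatenate(resized, axis=0)


def split_into_groups(fmap, num_groups):
    """Split a (C, H, W) map into num_groups equal channel groups."""
    return np.split(fmap, num_groups, axis=0)


# Hypothetical three-level feature pyramid (channels, height, width).
rng = np.random.default_rng(0)
pyramid = [
    rng.standard_normal((8, 32, 32)),
    rng.standard_normal((16, 16, 16)),
    rng.standard_normal((32, 8, 8)),
]

aggregated = aggregate_full_scale(pyramid, target_index=1)
groups = split_into_groups(aggregated, num_groups=4)

print(aggregated.shape)            # (56, 16, 16)
print(len(groups), groups[0].shape)  # 4 (14, 16, 16)
```

In this sketch every scale contributes directly to the aggregated map, rather than passing information only between adjacent levels. A learned module would replace the fixed resizing and concatenation with trainable operations.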

Dual-Stream Feature Aggregation Network for Unmanned Aerial Vehicle Aerial Images Semantic Segmentation

Deep Dual-Stream Network with Scale Context Selection Attention Module for Semantic Segmentation

An Image Segmentation Method Based on Transformer and Multi-Scale Feature Fusion for UAV Marine Environment Monitoring

Multi-scale Feature Extraction and Fusion Net: Research on UAVs Image Semantic Segmentation Technology

Deep Feature Fusion for High-Resolution Aerial Scene Classification

BFANet: Effective Segmentation Network for Low Altitude High-Resolution Urban Scene Image

Dense Connectivity Based Two-Stream Deep Feature Fusion Framework for Aerial Scene Classification

Dual attention deep fusion semantic segmentation networks of large-scale satellite remote-sensing images

Aerial-BiSeNet: A real-time semantic segmentation network for high resolution aerial imagery

AF$_2$: Adaptive Focus Framework for Aerial Imagery Segmentation

Dual-Path Geometry-Aware Network for Semantic Segmentation of High-Resolution Aerial Images

Full-Scale Feature Aggregation and Grouping Feature Reconstruction-Based UAV Image Target Detection

DSNet: Multi-resolution Dense Encoder and Stack Decoder Network for Aerial Image Segmentation

Attention-Guided Multi-Scale Fusion Network for Similar Objects Semantic Segmentation

(AF)2-S3Net: Attentive Feature Fusion with Adaptive Feature Selection for Sparse Semantic Segmentation Network

Remote Sensing Image Semantic Segmentation Method Based on a Deep Convolutional Neural Network and Multiscale Feature Fusion

DARSegNet: A Real-Time Semantic Segmentation Method Based on Dual Attention Fusion Module and Encoder-Decoder Network

Efficient Depth Fusion Transformer for Aerial Image Semantic Segmentation

An Attention-Fused Network for Semantic Segmentation of Very-High-Resolution Remote Sensing Imagery

Efficient Multi-scale Network for Semantic Segmentation of Fine-Resolution Remotely Sensed Images