Abstract: Unmanned aerial vehicles (UAVs) equipped with remote-sensing object-detection devices are increasingly employed across diverse domains. However, detecting small, densely packed objects at varying scales against complex backgrounds remains challenging for conventional detection algorithms, and the limited computing resources of UAV-embedded systems force a trade-off between detection speed and accuracy. To address these issues, this paper proposes the Efficient Multidimensional Global Feature Adaptive Fusion Network (MGFAFNET), a detection method for UAV platforms. Our approach makes three contributions. First, we introduce the Dual-Branch Multidimensional Aggregation Backbone Network (DBMA), an efficient backbone that captures multidimensional global spatial interactions, which makes the features of complex and occluded targets more distinguishable while reducing the computational burden typically associated with processing high-resolution imagery. Second, we construct the Dynamic Spatial Perception Feature Fusion Network (DSPF), tailored to the large scale variations encountered during UAV operation. By combining multi-layer dynamic spatial fusion with feature-refinement modules, the network reduces information redundancy and yields a more efficient feature representation. Finally, we devise the Localized Compensation Dual-Mask Distillation (LCDD) strategy, which transfers rich local and global features from the higher-capacity teacher network to the more resource-constrained student network, capturing both low-level spatial details and high-level semantic cues. The practicality and performance of MGFAFNET are validated on a dedicated UAV detection platform, with evaluations on the VisDrone2021 benchmark and a self-built proprietary dataset showing improvements over state-of-the-art object-detection methods.
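
To make the idea of mask-weighted teacher-to-student feature transfer concrete, the following is a minimal sketch of a generic masked feature-distillation loss. It is not the LCDD strategy itself, whose details the abstract does not give. The function names, the two masks, and the weighting factor `alpha` are all hypothetical, and the feature maps are plain nested lists so that the example stays self-contained.

```python
def masked_distillation_loss(teacher, student, mask):
    """Mask-weighted mean squared error between two 2-D feature maps.

    Hypothetical helper: `teacher`, `student`, and `mask` are equally
    sized nested lists; `mask` holds non-negative per-location weights.
    """
    total = 0.0
    weight = 0.0
    for t_row, s_row, m_row in zip(teacher, student, mask):
        for t, s, m in zip(t_row, s_row, m_row):
            total += m * (t - s) ** 2
            weight += m
    # Avoid division by zero when the mask selects no location.
    return total / weight if weight > 0 else 0.0


def dual_mask_loss(teacher, student, local_mask, global_mask, alpha=0.5):
    """Blend a local-mask term and a global-mask term.

    Assumption for illustration only: the two terms are combined
    linearly with weight `alpha`.
    """
    local_term = masked_distillation_loss(teacher, student, local_mask)
    global_term = masked_distillation_loss(teacher, student, global_mask)
    return alpha * local_term + (1 - alpha) * global_term


# Toy 2x2 feature maps: the student differs from the teacher in column 1.
teacher = [[1.0, 2.0], [3.0, 4.0]]
student = [[1.0, 0.0], [3.0, 2.0]]
local_mask = [[0, 1], [0, 0]]   # focus on a single location
global_mask = [[1, 1], [1, 1]]  # cover the whole map

loss = dual_mask_loss(teacher, student, local_mask, global_mask)
```

In this toy setting the local term looks only at one mismatched location, while the global term averages the error over the whole map. A real detector would compute such terms on multi-channel tensors and derive the masks from ground-truth boxes or attention maps.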

Lightweight UAV Object-Detection Method Based on Efficient Multidimensional Global Feature Adaptive Fusion and Knowledge Distillation