SCDFMixer: Spatial–Channel Dual-Frequency Mixer Based on Satellite Optical Sensors for Remote Sensing Multiobject Detection

Ben Liang, Jia Su, Kangkang Feng, Yanming Liu, Dongwen Zhang, Weimin Hou
DOI: https://doi.org/10.1109/jsen.2023.3348097
IF: 4.3
2024-02-16
IEEE Sensors Journal
Abstract: Object detection and recognition in high-resolution remote sensing images captured by satellite optical sensors from a bird's-eye perspective has long been a challenging task with considerable practical value. Unlike natural scenes, remote sensing scenes have several inherent characteristics that severely limit detection accuracy in geospatial object analysis: irregular variations in scale and orientation, intricate contextual textures with diverse ground materials and domain gaps, and dense distributions of tiny, fuzzy objects. To address these issues, we propose a spatial-channel dual-frequency mixer (SCDFMixer) designed to improve the precision of localization and categorization in remote sensing detection and recognition (RSDR) tasks. First, we design a novel spatial-channel mixer (SCMixer) backbone network that captures long-range context priors, distant spatial dependencies, and intrachannel relationships in remote sensing imagery. Second, to suppress interference from complex backgrounds more effectively and to improve adaptability to large scale variations, we propose a scale-adaptive perception aggregator (SPA) that uses multiscale contexts in a scale-aware manner, capturing both localized semantics and extensive surroundings to generate robust and discriminative representations for remote sensing object detection. Finally, we introduce a novel generalized dual-frequency balance (DFB) module that extracts integrated hybrid features encoding both global low-frequency and local high-frequency cues. Extensive experiments on three publicly available remote sensing datasets, DIOR, RSOD, and NWPU VHR-10, show that the proposed SCDFMixer outperforms the compared methods and achieves state-of-the-art detection performance, with mean average precision (mAP) of 82.7%, 96.2%, and 96.0%, respectively.
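The abstract does not say how the DFB module separates the two frequency bands. As a rough illustration of the general idea only, the following minimal sketch splits a 2D feature map into a smooth low-frequency component and a high-frequency residual, then recombines them with adjustable weights. It assumes a simple box blur as the low-pass filter, and the names `box_blur`, `split_frequencies`, and `merge` are hypothetical; none of this is taken from the paper.

```python
def box_blur(image, radius=1):
    """Mean filter with edge clamping (assumed low-pass filter, not the paper's)."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total, count = 0.0, 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    # Clamp neighbour coordinates to the image border.
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    total += image[yy][xx]
                    count += 1
            out[y][x] = total / count
    return out


def split_frequencies(image, radius=1):
    """Return (low, high): the blurred map and the residual image - low."""
    low = box_blur(image, radius)
    high = [[image[y][x] - low[y][x] for x in range(len(image[0]))]
            for y in range(len(image))]
    return low, high


def merge(low, high, w_low=1.0, w_high=1.0):
    """Weighted recombination; weights of 1.0 reconstruct the input."""
    return [[w_low * l + w_high * h for l, h in zip(lrow, hrow)]
            for lrow, hrow in zip(low, high)]


if __name__ == "__main__":
    img = [[0.0, 0.0, 0.0],
           [0.0, 9.0, 0.0],
           [0.0, 0.0, 0.0]]
    low, high = split_frequencies(img)
    print(low[1][1], high[1][1])
```

In this sketch, raising `w_high` relative to `w_low` emphasizes edges and fine detail, while the reverse emphasizes broad context; a learned module would presumably set such a balance from data rather than by hand.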
Engineering, Electrical & Electronic; Instruments & Instrumentation; Physics, Applied