Enhanced Spectral-Spatial Fusion Network for Multispectral Object Detection in Ground-Aerial Images

Fengxiang Xu, Tingfa Xu, Lang Hong, Peiran Peng, Jiaxin Guo, Jianan Li
DOI: https://doi.org/10.1109/lgrs.2024.3440045
IF: 5.343
2024-01-01
IEEE Geoscience and Remote Sensing Letters
Abstract: In recent years, multispectral object detection has gained widespread attention for its strong performance in detecting objects with similar colors or textures in complex environments. Mainstream methods fuse visible-light (RGB) and thermal images to compensate for the shortcomings of a single modality and thereby improve detection accuracy. However, most methods fail to take the inherent differences between modalities into account. RGB and thermal images differ in object attributes, so each modality may contribute unequally to the fused features. Hence, feeding both modalities into the feature extractor on equal terms limits the expressiveness of the fused features. To make reasonable use of the complementary information cues of each modality, this letter proposes an effective cross-modality feature fusion network. It comprises a spectral-spatial enhancing module (SSE) and a feature fusion module via Transformer (FFT). On the aerial dataset with dual-modality data, our model improves the detection accuracy metrics mAP50 and mAP by 1.2% and 0.9%, respectively, over the best dual-modality network. Comprehensive experiments on both ground and aerial datasets demonstrate that our approach outperforms existing methods. These results are significant for enhancing the robustness and accuracy of multispectral object detection.