Hybrid FusionNet: A Hybrid Feature Fusion Framework for Multi-Source High-Resolution Remote Sensing Image Classification

Yongjie Zheng, Sicong Liu, Hao Chen, Lorenzo Bruzzone
DOI: https://doi.org/10.1109/tgrs.2024.3352812
IF: 8.2
2024-01-01
IEEE Transactions on Geoscience and Remote Sensing
Abstract: With the increasing number of high-resolution (HR) images captured by various platforms, integrating the spectral and spatial properties of data across different HR image types, such as multispectral (MS), hyperspectral (HS), and multitemporal (MT) images, remains a challenging task for object classification. This paper proposes a novel hybrid framework named Hybrid FusionNet (HFN) that jointly exploits 2D-3D Convolutional Neural Networks (CNNs) and a Transformer encoder to address this complex classification problem. By incorporating 2D and 3D convolutional layers, the proposed HFN generates rich multi-dimensional hybrid features, including spectral, spatial, and temporal features. These features are then fed into a Transformer encoder to learn global saliency and discriminative information, enabling the identification of spatially irregular and spectrally similar objects. The hybrid architecture first captures intricate local spectral-spatial-temporal contextual features through the convolutional layers. The Transformer encoder then learns global long-range dependencies, including those along the spectral dimension, which mitigates the effect of spectral-spatial mutations, distortions, and variations of ground objects. Experimental results on an HR-MS dataset, an HR-HS dataset, and an HR-MT dataset covering complex urban scenarios confirm the effectiveness of the proposed approach compared to the main state-of-the-art methods. Notably, the proposed HFN achieves satisfactory classification performance even with limited training samples. The source code will be made available at https://github.com/MissYongjie/Hybrid-FusionNet.
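To make the data flow described in the abstract more concrete, the following is a minimal shape-tracing sketch of a generic 3D convolution, 2D convolution, and tokenization pipeline. It is not the authors' implementation: the function names `conv_out` and `trace_shapes`, the kernel sizes, the channel counts, and the unpadded stride-1 convolutions are all assumptions chosen for illustration, and the abstract does not specify any of them.

```python
def conv_out(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Output length of a convolution along one axis (no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_shapes(bands: int, patch: int,
                 k3d=(7, 3, 3), c3d: int = 8,
                 k2d: int = 3, c2d: int = 64) -> dict:
    """Trace tensor shapes for one image patch (all hyperparameters are hypothetical).

    bands: length of the spectral (or temporal) axis of the patch
    patch: spatial side length of the square patch
    """
    shapes = {}

    # Input laid out as (channels, depth, height, width); depth holds the bands.
    shapes["input"] = (1, bands, patch, patch)

    # A 3D convolution slides over the band axis and both spatial axes,
    # so it can mix spectral/temporal and spatial context locally.
    d = conv_out(bands, k3d[0])
    h = conv_out(patch, k3d[1])
    w = conv_out(patch, k3d[2])
    shapes["after_conv3d"] = (c3d, d, h, w)

    # Fold the channel and depth axes together so a 2D convolution can
    # refine the spatial structure of the stacked feature maps.
    shapes["folded"] = (c3d * d, h, w)
    h2 = conv_out(h, k2d)
    w2 = conv_out(w, k2d)
    shapes["after_conv2d"] = (c2d, h2, w2)

    # Each spatial position becomes one token of width c2d. A Transformer
    # encoder keeps this (tokens, width) shape while relating all positions.
    shapes["tokens"] = (h2 * w2, c2d)
    return shapes


if __name__ == "__main__":
    for name, shape in trace_shapes(bands=30, patch=11).items():
        print(f"{name:>13}: {shape}")
```

Under these assumed settings, an 11 x 11 patch with 30 bands yields 49 tokens of width 64. Self-attention over those tokens is where a model of this kind can relate distant positions that the local convolutions cannot connect.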