Transfer Representation Learning Meets Multimodal Fusion Classification for Remote Sensing Images

Mengru Ma, Wenping Ma, Licheng Jiao, Xu Liu, Fang Liu, Lingling Li, Shuyuan Yang, Biao Hou
DOI: https://doi.org/10.1109/tgrs.2022.3215177
IF: 8.2
2022-01-01
IEEE Transactions on Geoscience and Remote Sensing
Abstract: To maximize the complementary advantages of synergistic multimodal data, this article proposes a transfer representation learning fusion network (TRLF-Net) for the collaborative classification of multisource remote sensing images. First, for feature encoding, we design a dual-branch attention sparse transfer module (DAST-Module), which combines spatial and channel attention (CA) masks to transfer the advantageous attributes of the panchromatic (PAN) and multispectral (MS) images to each other. This not only enhances the respective strengths of each image but also facilitates the sparse fusion of low-level features. Second, to separate multiscale information, we design a deep dual-scale decomposition module (DDSD-Module), which allows the decomposition into high-frequency and low-frequency components. Through the design of the loss function, the decomposed information is then used to make the essential difference between the complementary multimodal images as small as possible and the surrounding contour difference as large as possible. Finally, to address the problem of large intraclass and small interclass differences, we develop a representation fusion of global and local features module (RFGAL-Module). It mainly uses global features to sort local features within classes and then outputs them in a cascade. This improves the characterization ability of the features, and the global and local features are used in a coordinated manner to accomplish the sample classification task. The experimental results demonstrate that TRLF-Net obtains much improved accuracy and efficiency. The code is available at: https://github.com/ru-willow/SRLF-Net.
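To make the idea of splitting an image into low-frequency and high-frequency components concrete, here is a minimal sketch in plain Python. It is not the DDSD-Module, which the abstract describes as a deep, learned module trained through a loss function. This sketch assumes a fixed mean filter as the low-pass step and takes the residual as the high-frequency part. The names `box_blur` and `decompose` and the `radius` parameter are hypothetical and chosen only for illustration.

```python
def box_blur(image, radius=1):
    """Mean filter over a (2*radius+1)^2 window.

    Border pixels are averaged over the in-bounds neighbours only.
    The result serves as an illustrative low-frequency component.
    """
    rows, cols = len(image), len(image[0])
    blurred = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total, count = 0.0, 0
            for dr in range(-radius, radius + 1):
                for dc in range(-radius, radius + 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        total += image[rr][cc]
                        count += 1
            blurred[r][c] = total / count
    return blurred


def decompose(image, radius=1):
    """Split an image into (low, high) with low + high == image.

    Assumption: a fixed filter stands in for the learned decomposition.
    """
    low = box_blur(image, radius)
    high = [
        [image[r][c] - low[r][c] for c in range(len(image[0]))]
        for r in range(len(image))
    ]
    return low, high


if __name__ == "__main__":
    # A flat patch with one bright pixel: the spike lands mostly in `high`.
    patch = [[1.0, 1.0, 1.0], [1.0, 10.0, 1.0], [1.0, 1.0, 1.0]]
    low, high = decompose(patch)
    print("low :", low)
    print("high:", high)
```

In this sketch a constant region yields a zero high-frequency component, while edges and isolated details end up in `high`. That is the kind of base/detail separation on which a loss could then treat the two components differently.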