Foundation Model-Based Multimodal Remote Sensing Data Classification

Xin He, Yushi Chen, Lingbo Huang, Danfeng Hong, Qian Du
DOI: https://doi.org/10.1109/tgrs.2023.3344698
IF: 8.2
2023-01-01
IEEE Transactions on Geoscience and Remote Sensing
Abstract: With the increasing availability and openness of remote sensing (RS) data collected from diverse sensors, there has been growing interest in multimodal RS data classification. In deep learning, a paradigm shift is under way with the rise of foundation models, which are trained on large-scale datasets and can be adapted to a wide range of downstream tasks. This study investigates the potential and effectiveness of foundation models for multimodal RS data classification. Because the training datasets of foundation models differ substantially from multimodal RS datasets, it is difficult to apply a pretrained foundation model to multimodal RS data classification directly. To mitigate this difficulty, this article proposes a foundation model adaptation (FMA) framework for multimodal RS data classification that does not fine-tune the foundation model's parameters. Specifically, two learnable modules, a cross-spatial interaction module and a cross-channel interaction module, are added to the foundation model to extract multimodal-specific representations. The cross-spatial and cross-channel interaction modules extract the characteristics of unimodal features along the spatial dimension and the channel dimension, respectively. To address the disparities among RS modalities, an alignment approach (FMA2) is further explored on top of the FMA. FMA2 describes the dependencies between modalities by establishing a coupling score function, which further enhances classification performance. To demonstrate the effectiveness of the FMA framework, comprehensive experiments are conducted on three multimodal RS datasets, showing improvement over advanced multimodal RS data classification methods.
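The adaptation idea in the abstract (a frozen pretrained layer, two small learnable modules acting along the spatial and channel dimensions, and a score relating two modalities) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no equations, so the class names `FrozenBlock`, `SpatialInteraction`, and `ChannelInteraction`, the residual linear mixing, and the use of cosine similarity as a stand-in for the coupling score are all assumptions made for illustration.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add(a, b):
    """Element-wise sum of two equally shaped matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def rand_matrix(rows, cols, rng, scale=0.1):
    """Random matrix with entries drawn uniformly from [-scale, scale]."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class FrozenBlock:
    """Stand-in for one pretrained layer; its weights are never updated."""

    def __init__(self, channels, rng):
        self.weight = rand_matrix(channels, channels, rng)

    def __call__(self, tokens):  # tokens: N x C (spatial positions x channels)
        return add(tokens, matmul(tokens, self.weight))


class SpatialInteraction:
    """Hypothetical learnable module: mixes features along the spatial axis."""

    def __init__(self, num_tokens, rng):
        self.mix = rand_matrix(num_tokens, num_tokens, rng)  # N x N, trainable

    def __call__(self, tokens):
        return add(tokens, matmul(self.mix, tokens))


class ChannelInteraction:
    """Hypothetical learnable module: mixes features along the channel axis."""

    def __init__(self, channels, rng):
        self.mix = rand_matrix(channels, channels, rng)  # C x C, trainable

    def __call__(self, tokens):
        return add(tokens, matmul(tokens, self.mix))


def coupling_score(feat_a, feat_b):
    """Cosine similarity of mean-pooled features (assumed stand-in for the coupling score)."""
    pooled_a = [sum(col) / len(col) for col in zip(*feat_a)]
    pooled_b = [sum(col) / len(col) for col in zip(*feat_b)]
    dot = sum(x * y for x, y in zip(pooled_a, pooled_b))
    norm_a = math.sqrt(sum(x * x for x in pooled_a))
    norm_b = math.sqrt(sum(x * x for x in pooled_b))
    return dot / (norm_a * norm_b)


def adapt(tokens, block, spatial, channel):
    """Frozen layer followed by the two learnable interaction modules."""
    return channel(spatial(block(tokens)))


rng = random.Random(0)
N, C = 4, 3  # toy sizes: 4 spatial positions, 3 channels
block = FrozenBlock(C, rng)  # shared by both modalities, kept frozen
modality_a = rand_matrix(N, C, rng, scale=1.0)
modality_b = rand_matrix(N, C, rng, scale=1.0)

# Each modality gets its own small set of trainable modules.
feat_a = adapt(modality_a, block, SpatialInteraction(N, rng), ChannelInteraction(C, rng))
feat_b = adapt(modality_b, block, SpatialInteraction(N, rng), ChannelInteraction(C, rng))
score = coupling_score(feat_a, feat_b)
trainable_per_modality = N * N + C * C  # only the interaction modules would be trained
print(len(feat_a), len(feat_a[0]), trainable_per_modality)
```

In a real training setup only the interaction-module weights would receive gradient updates while the pretrained weights stay fixed, which is what keeps the number of trainable parameters small relative to the foundation model.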