Generic Knowledge Boosted Pre-training For Remote Sensing Images

Ziyue Huang,Mingming Zhang,Yuan Gong,Qingjie Liu,Yunhong Wang
2024-01-21
Abstract:Deep learning models are essential for scene classification, change detection, land cover segmentation, and other remote sensing image understanding tasks. Most backbones of existing remote sensing deep learning models are typically initialized by pre-trained weights obtained from ImageNet pre-training (IMP). However, domain gaps exist between remote sensing images and natural images (e.g., ImageNet), making deep learning models initialized by pre-trained weights of IMP perform poorly for remote sensing image understanding. Although some pre-training methods are studied in the remote sensing community, current remote sensing pre-training methods face the problem of vague generalization by only using remote sensing images. In this paper, we propose a novel remote sensing pre-training framework, Generic Knowledge Boosted Remote Sensing Pre-training (GeRSP), to learn robust representations from remote sensing and natural images for remote sensing understanding tasks. GeRSP contains two pre-training branches: (1) A self-supervised pre-training branch is adopted to learn domain-related representations from unlabeled remote sensing images. (2) A supervised pre-training branch is integrated into GeRSP for general knowledge learning from labeled natural images. Moreover, GeRSP combines two pre-training branches using a teacher-student architecture to simultaneously learn representations with general and special knowledge, which generates a powerful pre-trained model for deep learning model initialization. Finally, we evaluate GeRSP and other remote sensing pre-training methods on three downstream tasks, i.e., object detection, semantic segmentation, and scene classification. The extensive experimental results consistently demonstrate that GeRSP can effectively learn robust representations in a unified manner, improving the performance of remote sensing downstream tasks.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The problem that this paper attempts to solve is: **How to improve the pre - training effect of deep - learning models in remote - sensing image - understanding tasks by combining the knowledge of natural images and remote - sensing images**. Specifically, the existing pre - training methods for remote - sensing images mainly rely on weights pre - trained from natural - image data sets such as ImageNet (IMP). However, due to the domain gap between remote - sensing images and natural images (such as differences in perspective, resolution, object appearance, etc.), these pre - trained models perform poorly in remote - sensing image - understanding tasks. In addition, methods that only use remote - sensing images for pre - training also face the problem of insufficient generalization ability. To solve these problems, the paper proposes a new pre - training framework - **Generic Knowledge Boosted Remote Sensing Pre - training (GeRSP)**, which aims to learn more robust representations by combining the knowledge of natural images and remote - sensing images, thereby improving the performance of remote - sensing image - understanding tasks. ### Main contributions of GeRSP: 1. **Proposed a new pre - training framework for remote - sensing images**: GeRSP simultaneously learns general knowledge and domain - specific knowledge by combining supervised pre - training and self - supervised pre - training. 2. **Contains two pre - training branches**: - **Self - supervised pre - training branch**: Learns domain - related features from unlabeled remote - sensing images. - **Supervised pre - training branch**: Learns general knowledge from labeled natural images. 3. **Combines the two pre - training branches through a teacher - student architecture**: Ensures that the model can simultaneously learn general knowledge and domain - specific knowledge, generating a powerful pre - trained model. 4. **Evaluated on three downstream tasks**: Including object detection, semantic segmentation, and scene classification. The experimental results show that GeRSP can effectively improve the performance of remote - sensing image - understanding tasks. ### Core ideas of the paper: - **Enhance generalization ability by using the diversity of natural images**: Natural images provide a wide range of general knowledge, while remote - sensing images contain domain - specific knowledge. By combining the two, the deficiencies of a single data source can be compensated for. - **Solve the limitations of existing pre - training methods**: Existing remote - sensing pre - training methods either rely on natural images, ignoring the domain gap, or only use remote - sensing images, resulting in insufficient generalization ability. GeRSP overcomes these limitations through joint training. In this way, GeRSP not only improves the performance of remote - sensing image - understanding tasks but also provides a new idea for future research, that is, how to better combine the knowledge of different domains to improve the generalization ability and robustness of the model.