Abstract: Recently, supervised deep learning has achieved great success in remote sensing image (RSI) semantic segmentation. However, supervised learning for semantic segmentation requires a large number of labeled samples, which are difficult to obtain in the field of remote sensing. A new learning paradigm, self-supervised learning (SSL), can address this problem by pretraining a general model on a large number of unlabeled images and then fine-tuning it on a downstream task with very few labeled samples. Contrastive learning is a typical SSL method that can learn general invariant features. However, most existing contrastive learning methods are designed for classification tasks and produce an image-level representation, which may be suboptimal for semantic segmentation tasks that require pixel-level discrimination. Therefore, we propose a global style and local matching contrastive learning network (GLCNet) for RSI semantic segmentation. Specifically, the global style contrastive learning module is first used to better learn an image-level representation, as we consider that style features can better represent the overall image features. Next, the local features matching contrastive learning module is designed to learn the representations of local regions, which is beneficial for semantic segmentation. We evaluate our method on four RSI semantic segmentation datasets, and the experimental results show that it mostly outperforms the state-of-the-art self-supervised methods and the ImageNet pretraining method. Specifically, with 1% annotation from the original dataset, our approach improves Kappa by 6% on the International Society for Photogrammetry and Remote Sensing (ISPRS) Potsdam dataset relative to the existing baseline. Moreover, our method outperforms supervised learning methods when there are some differences between the datasets of upstream tasks and downstream tasks.
Our study promotes the development of SSL in the field of RSI semantic segmentation. Because SSL can directly learn the essential characteristics of data from unlabeled images, which are easy to obtain in the remote sensing field, it may be of great significance for tasks such as global mapping. The source code is available at https://github.com/GeoX-Lab/G-RSIM.
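
To make the two modules described in the abstract more concrete, the following is a minimal NumPy sketch, not the authors' implementation (see the repository above for that). It rests on several assumptions that the abstract does not state: style features are taken to be channel-wise means and standard deviations of a feature map, local regions are taken to be average-pooled boxes at matching locations in two augmented views, and both terms use a simplified one-directional InfoNCE loss. The function names, the temperature, and the weight `lam` are hypothetical.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row of x to unit Euclidean length."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def info_nce(z1, z2, temperature=0.5):
    """Simplified one-directional InfoNCE.

    z1, z2: (N, D) arrays; row i of z1 and row i of z2 form a positive pair,
    and the other rows of z2 serve as negatives for row i of z1.
    """
    z1, z2 = l2_normalize(z1), l2_normalize(z2)
    logits = z1 @ z2.T / temperature                     # (N, N) cosine similarities
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))

def style_vector(feat):
    """Assumed style descriptor: channel means and stds of a (C, H, W) map."""
    flat = feat.reshape(feat.shape[0], -1)
    return np.concatenate([flat.mean(axis=1), flat.std(axis=1)])

def global_style_loss(feats1, feats2, temperature=0.5):
    """Image-level contrast between style vectors of two views of a batch."""
    s1 = np.stack([style_vector(f) for f in feats1])
    s2 = np.stack([style_vector(f) for f in feats2])
    return info_nce(s1, s2, temperature)

def local_matching_loss(feat1, feat2, boxes, temperature=0.5):
    """Region-level contrast for one image.

    boxes: list of (y0, y1, x0, x1) regions assumed to cover the same ground
    location in both views' feature maps; each region is average-pooled.
    """
    p1 = np.stack([feat1[:, y0:y1, x0:x1].mean(axis=(1, 2)) for y0, y1, x0, x1 in boxes])
    p2 = np.stack([feat2[:, y0:y1, x0:x1].mean(axis=(1, 2)) for y0, y1, x0, x1 in boxes])
    return info_nce(p1, p2, temperature)

def combined_loss(feats1, feats2, boxes, lam=0.5, temperature=0.5):
    """Hypothetical weighted sum of the global and local terms."""
    g = global_style_loss(feats1, feats2, temperature)
    l = np.mean([local_matching_loss(a, b, boxes, temperature)
                 for a, b in zip(feats1, feats2)])
    return float(lam * g + (1.0 - lam) * l)

# Usage with random arrays standing in for encoder/decoder feature maps.
rng = np.random.default_rng(0)
view1 = rng.normal(size=(4, 8, 16, 16))                  # (N, C, H, W)
view2 = view1 + 0.1 * rng.normal(size=view1.shape)       # mildly perturbed second view
boxes = [(0, 8, 0, 8), (8, 16, 8, 16), (0, 8, 8, 16)]
print(combined_loss(view1, view2, boxes))
```

In this sketch the global term compares whole images through their style vectors, while the local term compares matched regions within an image, which is the kind of finer-grained signal the abstract argues is needed for pixel-level discrimination.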

Multimodal Supervised Contrastive Learning in Remote Sensing Downstream Tasks

Remote Sensing Images Semantic Segmentation with General Remote Sensing Vision Model via a Self-Supervised Contrastive Learning Method

Enhancing Hyperspectral Image Prediction with Contrastive Learning in Low-Label Regime

LaST: Label-Free Self-Distillation Contrastive Learning with Transformer Architecture for Remote Sensing Image Scene Classification

Cross-Modal Contrastive Learning for Remote Sensing Image Classification

Saliency Guided Contrastive Learning on Scene Images

Multiple Embeddings Contrastive Pretraining for Remote Sensing Image Classification

Multilabel-Guided Soft Contrastive Learning for Efficient Earth Observation Pretraining

Spatial and Semantic Consistency Contrastive Learning for Self-Supervised Semantic Segmentation of Remote Sensing Images

Confidence-Weighted Dual-Teacher Networks With Biased Contrastive Learning for Semi-Supervised Semantic Segmentation in Remote Sensing Images

A Unified Contrastive Loss for Self-Training

Multiform Ensemble Self-Supervised Learning for Few-Shot Remote Sensing Scene Classification

Contrastive Learning for Urban Land Cover Classification With Multimodal Siamese Network

Multi-Label Guided Soft Contrastive Learning for Efficient Earth Observation Pretraining

Contrastive Learning Based on Multiscale Hard Features for Remote-Sensing Image Scene Classification

CDEST: Class Distinguishability-Enhanced Self-Training Method for Adopting Pre-Trained Models to Downstream Remote Sensing Image Semantic Segmentation

Understanding Dark Scenes by Contrasting Multi-Modal Observations

Joint Learning of Semantic Segmentation and Height Estimation for Remote Sensing Image Leveraging Contrastive Learning

Global and Local Contrastive Self-Supervised Learning for Semantic Segmentation of HR Remote Sensing Images

DMF-CL: Dense Multi-scale Feature Contrastive Learning for Semantic Segmentation of Remote-Sensing Images