Abstract: Deep-learning-based semantic segmentation of remote sensing images requires a large amount of manually annotated data. To address this challenge, a method based on self-supervised learning is proposed. First, to learn the global and local features of remote sensing images simultaneously, a self-supervised network called TBSNet (Triple-Branch Self-supervised Network) is constructed. The network comprises an image transformation prediction branch, a global contrastive learning branch, and a local contrastive learning branch. The contrastive learning part of the network uses a novel data augmentation method that simulates the same remote sensing image under different weather conditions to form positive pairs, which improves the model's performance. In addition, channel attention and spatial attention mechanisms are integrated into the projection head of the global contrastive learning branch, and a fully connected layer is replaced with a convolutional layer in the local contrastive learning branch, which improves the model's feature extraction ability. Second, to reduce the high computational cost of the pre-training phase, an optimization strategy based on the TracIn method and sequential optimization theory is proposed, which increases pre-training efficiency. Finally, the model is fine-tuned with a small amount of annotated data, achieving effective semantic segmentation of remote sensing images despite limited annotation. Experimental results show that, with only 10% of the annotated data, the overall accuracy (OA) and recall of the model improve by 4.60% and 4.88%, respectively, compared with the traditional self-supervised model SimCLR (A Simple Framework for Contrastive Learning of Visual Representations). The method therefore has practical application value for semantic segmentation of remote sensing imagery and for other computer vision domains.
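To make the idea of weather-based positive pairs concrete, the following is a minimal sketch, not the TBSNet implementation. It assumes a simple haze blend as a stand-in for the weather augmentation (the abstract does not specify the actual transformation) and uses the NT-Xent loss from SimCLR as the contrastive objective. The names `add_haze`, `nt_xent_loss`, `airlight`, and the random linear "encoder" are hypothetical and chosen only for illustration.

```python
import numpy as np


def add_haze(image, density, airlight=0.9):
    """Blend an image (floats in [0, 1]) toward a uniform airlight value.

    Hypothetical stand-in for a weather-style augmentation; the paper's
    actual transformation is not described in the abstract.
    """
    return image * (1.0 - density) + airlight * density


def nt_xent_loss(z1, z2, temperature=0.5):
    """NT-Xent loss (as in SimCLR) for positive pairs (z1[i], z2[i])."""
    n = z1.shape[0]
    z = np.concatenate([z1, z2], axis=0)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # unit-length rows
    sim = (z @ z.T) / temperature                     # scaled cosine similarity
    np.fill_diagonal(sim, -np.inf)                    # a sample is not its own negative
    # Row i's positive is row i + n, and row i + n's positive is row i.
    positives = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])
    # Numerically stable log-softmax over each row.
    row_max = sim.max(axis=1, keepdims=True)
    log_prob = sim - row_max - np.log(np.exp(sim - row_max).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(2 * n), positives].mean()


# Toy usage: two hazy views of the same random "images", embedded by a
# random linear projection that stands in for an encoder and projection head.
rng = np.random.default_rng(0)
images = rng.random((4, 8, 8, 3))
view_a = add_haze(images, density=0.1)
view_b = add_haze(images, density=0.5)
projection = rng.normal(size=(8 * 8 * 3, 16))
z_a = view_a.reshape(4, -1) @ projection
z_b = view_b.reshape(4, -1) @ projection
print("contrastive loss:", nt_xent_loss(z_a, z_b))
```

In this sketch, minimizing the loss pulls the embeddings of the two views of each image together and pushes them away from the other images in the batch, which is the general mechanism a contrastive branch relies on.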

Autonomous Learning of Semantic Segmentation from Internet Images

WebSeg: Learning Semantic Segmentation from Web Searches

Learning Pixel-wise Labeling from the Internet Without Human Interaction

Semi-Supervised Learning for Visual Bird's Eye View Semantic Segmentation

Weakly Supervised Semantic Segmentation Based on Web Image Co-segmentation

Weakly Supervised Semantic Segmentation Based on Co-segmentation

Webly-supervised semantic segmentation via curriculum learning

Weakly-supervised Semantic Segmentation Via Online Pseudo-Mask Correcting

Learning from Pixel-Level Label Noise: A New Perspective for Semi-Supervised Semantic Segmentation

Learning to Segment with Image-Level Annotations

Semantic Connectivity-Driven Pseudo-labeling for Cross-domain Segmentation

ECS-Net: Improving Weakly Supervised Semantic Segmentation by Using Connections Between Class Activation Maps

Learning to Exploit the Prior Network Knowledge for Weakly-Supervised Semantic Segmentation

Research on Semantic Segmentation Method of Remote Sensing Image Based on Self-supervised Learning

Learning Effectively from Noisy Supervision for Weakly Supervised Semantic Segmentation

Erase then Grow: Generating Correct Class Activation Maps for Weakly-Supervised Semantic Segmentation

Semi-Supervised Semantic Segmentation under Label Noise via Diverse Learning Groups

Semi-Supervised Semantic Segmentation of Remote Sensing Images With Iterative Contrastive Network

Semantic Segmentation for Multi-Scene Remote Sensing Images with Noisy Labels Based on Uncertainty Perception

CSENet: Cascade Semantic Erasing Network for Weakly-Supervised Semantic Segmentation

Exploiting Web Images for Fine-Grained Visual Recognition by Eliminating Open-Set Noise and Utilizing Hard Examples