Multi-view Remote Sensing Image Segmentation With SAM priors

Zipeng Qi, Chenyang Liu, Zili Liu, Hao Chen, Yongchang Wu, Zhengxia Zou, Zhenwei Sh

2024-05-23

Abstract: Multi-view segmentation in Remote Sensing (RS) seeks to segment images from diverse perspectives within a scene. Recent methods leverage 3D information extracted from an Implicit Neural Field (INF), bolstering result consistency across multiple views while using a limited number of labels (as few as 3-5) to reduce annotation labor. Nonetheless, achieving superior performance within the constraints of limited-view labels remains challenging due to inadequate scene-wide supervision and insufficient semantic features within the INF. To address these issues, we propose to inject the prior of the visual foundation model Segment Anything (SAM) into the INF to obtain better results with limited training data. Specifically, we contrast SAM features between testing and training views to derive pseudo labels for each testing view, augmenting scene-wide labeling information. Subsequently, we introduce SAM features via a transformer into the INF of the scene, supplementing the semantic information. The experimental results demonstrate that our method outperforms the mainstream method, confirming the efficacy of SAM as a supplement to the INF for this task.
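The pseudo-labeling idea in the abstract (contrasting SAM features between testing and training views) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that per-pixel SAM features have already been extracted and flattened into arrays, and it swaps in a plain nearest-neighbor match under cosine similarity. The function name `pseudo_labels_from_features`, the threshold `min_sim`, and the `ignore_index` value are all hypothetical.

```python
import numpy as np

def pseudo_labels_from_features(train_feats, train_labels, test_feats,
                                min_sim=0.5, ignore_index=-1):
    """Give each test feature the label of its most similar training feature.

    train_feats:  (N_train, D) features from labeled training views (assumed given)
    train_labels: (N_train,)   integer class label for each training feature
    test_feats:   (N_test, D)  features from an unlabeled testing view
    min_sim:      hypothetical cosine-similarity threshold; weaker matches are ignored
    """
    # L2-normalize so that a dot product equals cosine similarity.
    a = train_feats / (np.linalg.norm(train_feats, axis=1, keepdims=True) + 1e-8)
    b = test_feats / (np.linalg.norm(test_feats, axis=1, keepdims=True) + 1e-8)

    sim = b @ a.T                   # (N_test, N_train) similarity matrix
    nearest = sim.argmax(axis=1)    # index of the best-matching training feature
    best = sim.max(axis=1)          # similarity of that best match

    labels = train_labels[nearest].copy()
    labels[best < min_sim] = ignore_index   # drop low-confidence pseudo labels
    return labels, best
```

Pixels marked with `ignore_index` would simply be excluded from the segmentation loss in this sketch, so only confident matches contribute extra supervision.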

Computer Vision and Pattern Recognition

What problem does this paper attempt to address?

This paper addresses how to achieve higher-quality segmentation with limited labeled data in the multi-view remote sensing image segmentation task. Existing methods use 3D information extracted from implicit neural fields (INF) to improve the consistency of results across views. With only a few labeled views, however, strong performance is hard to reach because scene-wide supervision is inadequate and the INF carries insufficient semantic features. To solve these problems, the authors propose injecting the prior knowledge of a large-scale visual foundation model, the Segment Anything Model (SAM), into the INF so that better segmentation results can be obtained from a limited amount of training data. The method has two stages: first, it constructs the INF of the scene; then, it integrates SAM features into the INF through a Transformer to supplement semantic information, and generates pseudo-labels by comparing SAM features between the test views and the training views, which enriches the scene-wide labeling information. Experimental results show that the method outperforms mainstream methods on multi-view segmentation, supporting the effectiveness of SAM as a supplement to the INF.
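The second stage described above, integrating SAM features into the INF through a Transformer, can be sketched as a single cross-attention step. This is only an illustration under stated assumptions and not the paper's architecture: it uses one attention head, random or externally supplied projection matrices, and a residual connection. The names `fuse_sam_into_inf`, `w_q`, `w_k`, and `w_v` are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def fuse_sam_into_inf(inf_feats, sam_feats, w_q, w_k, w_v):
    """Single-head cross-attention: INF features query SAM features.

    inf_feats: (N, d_inf) features of points sampled from the INF (assumed given)
    sam_feats: (M, d_sam) SAM image features (assumed given)
    w_q: (d_inf, d), w_k: (d_sam, d), w_v: (d_sam, d_inf) hypothetical projections
    """
    q = inf_feats @ w_q                               # queries come from the INF
    k = sam_feats @ w_k                               # keys come from SAM
    v = sam_feats @ w_v                               # values come from SAM
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]))    # (N, M) attention weights
    fused = inf_feats + attn @ v                      # residual: INF plus SAM semantics
    return fused, attn
```

The residual form keeps the original INF features intact and adds SAM-derived semantics on top, which is one common way to supplement an existing representation; the paper may use a different design.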

RSAM-Seg: A SAM-based Approach with Prior Knowledge Integration for Remote Sensing Image Semantic Segmentation

Integrating Spatial Prior Adapter for Enhancing SAM Performance in Medical Image Segmentation

Tuning a SAM-Based Model with Multi-Cognitive Visual Adapter to Remote Sensing Instance Segmentation

MeSAM: Multiscale Enhanced Segment Anything Model for Optical Remote Sensing Images

SAM-RSIS: Progressively Adapting SAM With Box Prompting to Remote Sensing Image Instance Segmentation

RSPS-SAM: A Remote Sensing Image Panoptic Segmentation Method Based on SAM

SAMRS: Scaling-up Remote Sensing Segmentation Dataset with Segment Anything Model

Implicit Ray-Transformers for Multi-view Remote Sensing Image Segmentation

SAM-Assisted Remote Sensing Imagery Semantic Segmentation with Object and Boundary Constraints

Customize Segment Anything Model for Multi-Modal Semantic Segmentation with Mixture of LoRA Experts

The Segment Anything Model (SAM) for Remote Sensing Applications: From Zero to One Shot

PointSAM: Pointly-Supervised Segment Anything Model for Remote Sensing Images

Self-guided Few-shot Semantic Segmentation for Remote Sensing Imagery Based on Large Vision Models

A Synergistical Attention Model for Semantic Segmentation of Remote Sensing Images

Text2Seg: Remote Sensing Image Semantic Segmentation via Text-Guided Visual Foundation Models

Multi-View Feature Fusion and Rich Information Refinement Network for Semantic Segmentation of Remote Sensing Images

Segment Anything with Multiple Modalities

A Multispectral Remote Sensing Crop Segmentation Method Based on Segment Anything Model Using Multistage Adaptation Fine-Tuning

Segment Anything without Supervision