Abstract: Segment Anything Model (SAM) is an advanced foundational model for image segmentation, widely applied to remote sensing images (RSIs). Due to the domain gap between RSIs and natural images, traditional methods typically use SAM as a source pre-trained model and fine-tune it with fully supervised masks. Unlike these methods, our work focuses on fine-tuning SAM using more convenient and challenging point annotations. Leveraging SAM's zero-shot capabilities, we adopt a self-training framework that iteratively generates pseudo-labels for training. However, if the pseudo-labels contain noisy labels, there is a risk of error accumulation. To address this issue, we extract target prototypes from the target dataset and use the Hungarian algorithm to match them with prediction prototypes, preventing the model from learning in the wrong direction. Additionally, due to the complex backgrounds and dense distribution of objects in RSI, using point prompts may result in multiple objects being recognized as one. To solve this problem, we propose a negative prompt calibration method based on the non-overlapping nature of instance masks. In brief, we use the prompts of overlapping masks as corresponding negative signals, resulting in refined masks. Combining the above methods, we propose a novel Pointly-supervised Segment Anything Model named PointSAM. We conduct experiments on RSI datasets, including WHU, HRSID, and NWPU VHR-10, and the results show that our method significantly outperforms direct testing with SAM, SAM2, and other comparison methods. Furthermore, we introduce PointSAM as a point-to-box converter and achieve encouraging results, suggesting that this method can be extended to other point-supervised tasks. The code is available at https://github.com/Lans1ng/PointSAM.
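The negative prompt calibration idea in the abstract (the prompts of overlapping masks become negative signals for each other) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the IoU-based overlap test, and the `iou_thresh` parameter are all assumptions made for illustration, and the step that re-queries SAM with the calibrated prompts is left out. Masks are small nested 0/1 lists so the example needs only the standard library.

```python
def mask_to_pixels(mask):
    """Convert a nested 0/1 list into a set of (row, col) foreground pixels."""
    return {(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v}


def iou(a, b):
    """Intersection over union of two pixel sets (0.0 if both are empty)."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def negative_prompts(points, masks, iou_thresh=0.1):
    """Build calibrated prompts, one (coords, labels) pair per instance.

    points[i] is the (x, y) point prompt for instance i and masks[i] is the
    mask predicted from that point alone. If mask j overlaps mask i by more
    than iou_thresh (a hypothetical threshold), point j is added to prompt i
    as a negative point. Labels follow the common convention
    1 = foreground, 0 = background.
    """
    pixel_sets = [mask_to_pixels(m) for m in masks]
    prompts = []
    for i, point in enumerate(points):
        negatives = [
            points[j]
            for j in range(len(points))
            if j != i and iou(pixel_sets[i], pixel_sets[j]) > iou_thresh
        ]
        coords = [point] + negatives
        labels = [1] + [0] * len(negatives)
        prompts.append((coords, labels))
    return prompts


# Toy 1x6 image: mask 0 has leaked over the object that mask 1 covers,
# while mask 2 is isolated.
example_points = [(0, 0), (3, 0), (5, 0)]
example_masks = [
    [[1, 1, 1, 1, 0, 0]],
    [[0, 0, 1, 1, 0, 0]],
    [[0, 0, 0, 0, 0, 1]],
]
example_prompts = negative_prompts(example_points, example_masks)
```

In this toy case the first two instances receive each other's point as a negative prompt, and the isolated third instance keeps only its own positive point. A real pipeline would pass each `(coords, labels)` pair back to the segmentation model to obtain the refined mask.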

What problem does this paper attempt to address?

The paper addresses how to fine-tune the Segment Anything Model (SAM) for remote sensing images (RSIs) under point supervision, so that segmentation improves in scenes with complex backgrounds and densely distributed objects. Specifically:

1. **Domain Adaptation Problem**: Due to the domain gap between remote sensing images and natural images, applying the pre-trained SAM model directly to remote sensing images gives poor results. Traditional methods usually require fully supervised mask annotations to fine-tune SAM, which is both time-consuming and expensive.
2. **Pseudo-label Noise Problem**: Although self-training can use unlabeled data to generate pseudo-labels for iterative training, the pseudo-labels may contain noise, leading to error accumulation and degrading model performance.
3. **Polysemy Problem of Point Prompts**: In remote sensing images, point prompts may cause multiple objects to be misidentified as one instance because points lack boundary information, especially in scenes with densely distributed objects.

To solve these problems, the paper proposes the PointSAM model, with the following main contributions:

- **Prototype-based Regularization (PBR)**: Prototypes are extracted from the target dataset and matched with the predicted prototypes using the Hungarian algorithm, which prevents the model from learning in the wrong direction and improves its generalization ability.
- **Negative Prompt Calibration (NPC)**: Based on the prior assumption that instance masks do not overlap, the negative prompts are dynamically adjusted to reduce confusion in segmentation results in dense scenes and improve the accuracy of the predicted masks.
- **Experimental Verification**: Extensive experiments on three remote sensing image datasets (NWPU VHR-10, WHU, HRSID) show a significant improvement for PointSAM under point supervision and demonstrate its potential for extension to point-supervised object detection tasks.

These methods jointly improve the segmentation performance of SAM on remote sensing images, especially under point supervision, making the fine-tuning process more efficient and less costly.
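
The prototype matching step described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the paper's code: the cosine-distance cost, the mean-cost loss, and the names `cosine_distance`, `hungarian`, and `match_prototypes` are all hypothetical. The Hungarian algorithm is written out in plain Python to keep the example self-contained; in practice `scipy.optimize.linear_sum_assignment` solves the same assignment problem.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity of two equal-length vectors (1.0 if one is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm if norm > 0 else 1.0


def hungarian(cost):
    """Minimum-cost assignment for an n x m cost matrix with n <= m.

    Returns a sorted list of (row, col) pairs, one per row.
    Uses the O(n^2 * m) potentials formulation with 1-based internals.
    """
    n, m = len(cost), len(cost[0])
    assert n <= m, "expects at least as many columns as rows"
    inf = float("inf")
    u = [0.0] * (n + 1)   # row potentials
    v = [0.0] * (m + 1)   # column potentials
    p = [0] * (m + 1)     # p[j] = row currently assigned to column j
    way = [0] * (m + 1)   # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0 != 0:  # flip assignments along the augmenting path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    return sorted((p[j] - 1, j - 1) for j in range(1, m + 1) if p[j] != 0)


def match_prototypes(pred_protos, target_protos):
    """Match predicted to target prototypes; return (pairs, mean matched cost).

    pairs holds (pred_index, target_index) tuples. The mean cost stands in
    for a regularization term; the paper's actual loss may differ.
    """
    cost = [[cosine_distance(p, t) for t in target_protos] for p in pred_protos]
    if len(pred_protos) <= len(target_protos):
        pairs = hungarian(cost)
    else:
        transposed = [list(col) for col in zip(*cost)]
        pairs = sorted((pi, ti) for ti, pi in hungarian(transposed))
    loss = sum(cost[r][c] for r, c in pairs) / len(pairs)
    return pairs, loss


# Toy example with 2-D prototypes (real features would be high-dimensional).
example_targets = [[1.0, 0.0], [0.0, 1.0]]
example_preds = [[0.1, 0.9], [0.9, 0.1]]
example_pairs, example_loss = match_prototypes(example_preds, example_targets)
```

The one-to-one constraint is the reason for using an assignment solver rather than a nearest-neighbor lookup: each predicted prototype is tied to a distinct target prototype, so several noisy predictions cannot all collapse onto the same target.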

PointSAM: Pointly-Supervised Segment Anything Model for Remote Sensing Images

SAM-RSIS: Progressively Adapting SAM With Box Prompting to Remote Sensing Image Instance Segmentation

The Segment Anything Model (SAM) for Remote Sensing Applications: From Zero to One Shot

MeSAM: Multiscale Enhanced Segment Anything Model for Optical Remote Sensing Images

AM-SAM: Automated Prompting and Mask Calibration for Segment Anything Model

Point-SAM: Promptable 3D Segmentation Model for Point Clouds

RSAM-Seg: A SAM-based Approach with Prior Knowledge Integration for Remote Sensing Image Semantic Segmentation

RSPrompter: Learning to Prompt for Remote Sensing Instance Segmentation Based on Visual Foundation Model

PA-SAM: Prompt Adapter SAM for High-Quality Image Segmentation

MapSAM: Adapting Segment Anything Model for Automated Feature Detection in Historical Maps

RSPS-SAM: A Remote Sensing Image Panoptic Segmentation Method Based on SAM

Accurate, automatic zero-shot wetland mapping from high resolution remote sensing imagery by prompting large foundation model (Segment Anything Model-SAM)

Tuning a SAM-Based Model with Multi-Cognitive Visual Adapter to Remote Sensing Instance Segmentation

Stable Segment Anything Model

SAMRS: Scaling-up Remote Sensing Segmentation Dataset with Segment Anything Model

SAM-Assisted Remote Sensing Imagery Semantic Segmentation with Object and Boundary Constraints

Pro2SAM: Mask Prompt to SAM with Grid Points for Weakly Supervised Object Localization

SAMAug: Point Prompt Augmentation for Segment Anything Model

SAMPro3D: Locating SAM Prompts in 3D for Zero-Shot Scene Segmentation

All-in-SAM: from Weak Annotation to Pixel-wise Nuclei Segmentation with Prompt-based Finetuning