Point-SAM: Promptable 3D Segmentation Model for Point Clouds

Yuchen Zhou, Jiayuan Gu, Tung Yen Chiang, Fanbo Xiang, Hao Su

2024-06-26

Abstract: The development of 2D foundation models for image segmentation has been significantly advanced by the Segment Anything Model (SAM). However, achieving similar success in 3D models remains a challenge due to issues such as non-unified data formats, lightweight models, and the scarcity of labeled data with diverse masks. To this end, we propose a 3D promptable segmentation model (Point-SAM) focusing on point clouds. Our approach utilizes a transformer-based method, extending SAM to the 3D domain. We leverage part-level and object-level annotations and introduce a data engine to generate pseudo labels from SAM, thereby distilling 2D knowledge into our 3D model. Our model outperforms state-of-the-art models on several indoor and outdoor benchmarks and demonstrates a variety of applications, such as 3D annotation. Codes and demo can be found at https://github.com/zyc00/Point-SAM.

Computer Vision and Pattern Recognition, Artificial Intelligence

What problem does this paper attempt to address?

The paper aims to address the problem of 3D point cloud segmentation and proposes a new model called Point-SAM. Specifically:

- **Objective**: To establish a 3D promptable segmentation model for point clouds as a foundational step in building a 3D foundational model. This model can uniformly handle point cloud data from different data sources and predict effective segmentation masks.
- **Challenges**: In the 3D domain, compared to 2D image segmentation, there are issues such as non-uniform data formats, a lack of lightweight models, and scarce annotated data. Additionally, existing attempts are limited to extending 2D image results to 3D space, which is affected by factors like image quality and viewpoint selection, and ensuring multi-view consistency is challenging.
- **Method**: The authors propose Point-SAM, a model based on the Transformer architecture that can process input point cloud data and generate segmentation results through point prompts and mask prompts. To expand the training dataset, they developed a data engine to generate pseudo-labels, using SAM to generate initial diverse mask proposals and iteratively refining these proposals.
- **Contributions**: These include the development of Point-SAM, a 3D foundational model for point clouds; the proposal of a data engine to generate pseudo-labels with a large number of diverse masks; and the successful extension of the model and dataset for 3D segmentation experiments, demonstrating the model's zero-shot transfer capability on unseen point cloud distributions.
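To make the idea of point prompts concrete, the sketch below shows how positive and negative prompts can refine a per-point mask and how the result can be scored with intersection-over-union (IoU). It is only an illustration: `toy_segment` is a hypothetical nearest-prompt rule standing in for a learned model, and neither it nor `mask_iou` comes from the Point-SAM code base. The point coordinates, labels, and `radius` value are likewise assumptions chosen for the example.

```python
import numpy as np


def mask_iou(pred, target):
    """Intersection-over-union between two boolean per-point masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return float(np.logical_and(pred, target).sum() / union)


def toy_segment(points, prompt_xyz, prompt_labels, radius=10.0):
    """Hypothetical stand-in for a promptable model (NOT Point-SAM).

    A point is included in the mask if its nearest prompt is positive
    (label 1) and lies within `radius`. Label 0 marks a negative prompt.
    """
    points = np.asarray(points, dtype=float)
    prompt_xyz = np.asarray(prompt_xyz, dtype=float)
    prompt_labels = np.asarray(prompt_labels)
    # Pairwise distances, shape (num_points, num_prompts).
    dists = np.linalg.norm(points[:, None, :] - prompt_xyz[None, :, :], axis=-1)
    nearest = dists.argmin(axis=1)
    within = dists.min(axis=1) <= radius
    return within & (prompt_labels[nearest] == 1)


# Two small clusters; the target object is the first cluster only.
points = np.array([
    [0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [0.0, 0.1, 0.0],  # object
    [5.0, 0.0, 0.0], [5.1, 0.0, 0.0],                    # background
])
target = np.array([True, True, True, False, False])

# One positive prompt over-segments: every point is selected.
mask_1 = toy_segment(points, [[0.0, 0.0, 0.0]], [1])
iou_1 = mask_iou(mask_1, target)

# Adding a negative prompt on the background removes the extra points.
mask_2 = toy_segment(points, [[0.0, 0.0, 0.0], [5.0, 0.0, 0.0]], [1, 0])
iou_2 = mask_iou(mask_2, target)

print(iou_1, iou_2)
```

In this toy setting the single positive prompt selects all five points, and the added negative prompt restricts the mask to the three object points. A learned model would replace the nearest-prompt rule, but the interface (points and labelled prompts in, a per-point mask out) follows the summary's description of prompt-driven segmentation.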

Pass3D: Precise and Accelerated Semantic Segmentation for 3D Point Cloud

SAM2Point: Segment Any 3D as Videos in Zero-shot and Promptable Manners

SAMPro3D: Locating SAM Prompts in 3D for Zero-Shot Scene Segmentation

When 3D Partial Points Meets SAM: Tooth Point Cloud Segmentation with Sparse Labels

SAM3D: Segment Anything in 3D Scenes

PointSAM: Pointly-Supervised Segment Anything Model for Remote Sensing Images

Superpoint-guided Semi-supervised Semantic Segmentation of 3D Point Clouds

3D Object Segmentation Using Cross-Window Point Transformer with Latent Semantic Boundary Guidance

Leveraging Large-Scale Pretrained Vision Foundation Models for Label-Efficient 3D Point Cloud Segmentation

Segment Anything in 3D with Radiance Fields

When 3D Bounding-Box Meets SAM: Point Cloud Instance Segmentation with Weak-and-Noisy Supervision

SAMPart3D: Segment Any Part in 3D Objects

A Point Cloud Segmentation Method for Dim and Cluttered Underground Tunnel Scenes Based on the Segment Anything Model

Towards a Comprehensive, Efficient and Promptable Anatomic Structure Segmentation Model using 3D Whole-body CT Scans

PA-SAM: Prompt Adapter SAM for High-Quality Image Segmentation

PropSAM: A Propagation-Based Model for Segmenting Any 3D Objects in Multi-Modal Medical Images

SAI3D: Segment Any Instance in 3D Scenes

SAM-Med3D: Towards General-purpose Segmentation Models for Volumetric Medical Images

PointSeg: A Training-Free Paradigm for 3D Scene Segmentation via Foundation Models

AM-SAM: Automated Prompting and Mask Calibration for Segment Anything Model