3DSAM: Segment Anything in NeRF

Yan Zhang, Shangjie Wang
DOI: https://doi.org/10.1109/ICASSP48485.2024.10445897
2024-04-14
Abstract: Object segmentation within Neural Radiance Fields (NeRF) plays a pivotal role in downstream applications such as NeRF editing. Most existing methods rely heavily on feature similarity in 3D space, which makes them non-trivial for users to manipulate. Compared with intricate 3D interfaces, segmenting multiview images rendered from NeRF is more intuitive and improves both visibility and interactivity. However, annotating multiple images places a heavy demand on users. To address this, we propose an interactive NeRF segmentation framework that takes user input from just one rendered view and automatically generates consistent prompts across all other views. Specifically, we propose the Semantic Prompt Generator (SPG), which employs a pre-trained SAM image encoder to extract image features. Cosine similarities between these features are then used to form positive-negative location pair prompts. Moreover, we propose the Position Prompt Generator (PPG) to capture geometric relationships across different views, generating consistent bounding box prompts. Our method extends SAM’s segmentation capabilities to 3D scenarios without additional network training. Extensive evaluations confirm that our algorithm not only surpasses previous works in segmentation quality but also takes less time.
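The cosine-similarity step described for the SPG can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how locations are selected, so picking the single most and least similar locations is an assumption, as are the function names `cosine_similarity_map` and `point_prompt_pair`. The feature arrays stand in for SAM image encoder outputs; no SAM call is made here.

```python
import numpy as np


def cosine_similarity_map(query_feat, target_feats, eps=1e-8):
    """Cosine similarity between one query feature and every location.

    query_feat:   (C,) feature at the user-selected location (assumed input).
    target_feats: (H, W, C) feature map of another rendered view.
    Returns an (H, W) array of similarities in [-1, 1].
    """
    q = query_feat / (np.linalg.norm(query_feat) + eps)
    norms = np.linalg.norm(target_feats, axis=-1, keepdims=True)
    t = target_feats / (norms + eps)
    return t @ q


def point_prompt_pair(query_feat, target_feats):
    """Return (positive, negative) locations as (row, col) tuples.

    Assumption: the positive prompt is the most similar location and the
    negative prompt is the least similar one. The paper may use a
    different selection rule.
    """
    sim = cosine_similarity_map(query_feat, target_feats)
    pos = np.unravel_index(np.argmax(sim), sim.shape)
    neg = np.unravel_index(np.argmin(sim), sim.shape)
    return (int(pos[0]), int(pos[1])), (int(neg[0]), int(neg[1]))


# Synthetic stand-in for encoder features: a 4x4 map with 3 channels.
feats = np.tile(np.array([0.0, 1.0, 0.0]), (4, 4, 1))
feats[1, 2] = [1.0, 0.0, 0.0]   # location matching the query
feats[3, 0] = [-1.0, 0.0, 0.0]  # location opposite to the query
query = np.array([1.0, 0.0, 0.0])

positive, negative = point_prompt_pair(query, feats)
```

In a real pipeline, the resulting locations would be passed to a promptable segmenter as foreground and background point prompts for that view.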
Computer Science