SAMSNeRF: Segment Anything Model (SAM) Guides Dynamic Surgical Scene Reconstruction by Neural Radiance Field (NeRF)

Ange Lou, Yamin Li, Xing Yao, Yike Zhang, Jack Noble
2024-02-06
Abstract: The accurate reconstruction of surgical scenes from surgical videos is critical for various applications, including intraoperative navigation and image-guided robotic surgery automation. However, previous approaches, mainly relying on depth estimation, have limited effectiveness in reconstructing surgical scenes with moving surgical tools. To address this limitation and provide accurate 3D position prediction for surgical tools in all frames, we propose a novel approach called SAMSNeRF that combines Segment Anything Model (SAM) and Neural Radiance Field (NeRF) techniques. Our approach generates accurate segmentation masks of surgical tools using SAM, which guides the refinement of the dynamic surgical scene reconstruction by NeRF. Our experimental results on public endoscopy surgical videos demonstrate that our approach successfully reconstructs high-fidelity dynamic surgical scenes and accurately reflects the spatial information of surgical tools. Our proposed approach can significantly enhance surgical navigation and automation by providing surgeons with accurate 3D position information of surgical tools during surgery. The source code will be released soon.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses the problem of accurately reconstructing surgical scenes from surgical videos during minimally invasive surgery. Existing methods rely primarily on depth estimation and have limited effectiveness on dynamic scenes that contain moving surgical tools. To overcome this limitation, the authors propose SAMSNeRF, which combines the Segment Anything Model (SAM) with Neural Radiance Field (NeRF) techniques: SAM produces segmentation masks of the surgical tools, and these masks guide the refinement of the NeRF reconstruction.

The specific contributions are as follows:

1. **First dynamic scene reconstruction that includes surgical tools**: To the best of the authors' knowledge, SAMSNeRF is the first method capable of reconstructing surgical scenes that include surgical tools.
2. **Combination of SAM and NeRF**: For the first time, the SAM vision model is combined with NeRF to achieve higher precision in surgical scene reconstruction.

Experimental results on public endoscopy surgical videos show that SAMSNeRF reconstructs high-fidelity dynamic surgical scenes and accurately reflects the spatial information of surgical tools, which the authors argue can significantly enhance surgical navigation and automation.
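
The abstract says that SAM tool masks guide the NeRF refinement but does not spell out the mechanism. As an illustration only, the sketch below shows one generic way a segmentation mask could influence a photometric reconstruction loss: pixels inside the tool mask receive a larger weight than background pixels. The function name `mask_weighted_mse`, the `tool_weight` parameter, and the weighting scheme itself are assumptions made for this example; they are not taken from the paper and should not be read as the authors' loss.

```python
from typing import Sequence, Tuple

Pixel = Tuple[float, float, float]  # one RGB color, channels in [0, 1]


def mask_weighted_mse(
    rendered: Sequence[Pixel],
    target: Sequence[Pixel],
    tool_mask: Sequence[bool],
    tool_weight: float = 2.0,  # hypothetical value, not from the paper
) -> float:
    """Weighted mean squared color error over a flat list of pixels.

    Pixels flagged in tool_mask (for example, by a segmentation model)
    are weighted by tool_weight; all other pixels are weighted by 1.0.
    """
    if not (len(rendered) == len(target) == len(tool_mask)):
        raise ValueError("rendered, target and tool_mask must have equal length")
    if len(rendered) == 0:
        raise ValueError("at least one pixel is required")

    weighted_error = 0.0
    total_weight = 0.0
    for r, t, is_tool in zip(rendered, target, tool_mask):
        # Mean squared error across the color channels of this pixel.
        err = sum((rc - tc) ** 2 for rc, tc in zip(r, t)) / len(r)
        w = tool_weight if is_tool else 1.0
        weighted_error += w * err
        total_weight += w
    # Normalize by the total weight so the loss stays on the MSE scale.
    return weighted_error / total_weight


if __name__ == "__main__":
    rendered = [(0.0, 0.0, 0.0)] * 4
    target = [(1.0, 1.0, 1.0), (0.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
    mask = [True, False, False, False]
    print(mask_weighted_mse(rendered, target, mask))
```

With a weight above 1.0, an error on a tool pixel contributes more to the loss than the same error on a background pixel, which is the general idea behind letting a mask steer where the reconstruction is refined. In a real training loop this would be computed on tensors with automatic differentiation rather than on Python lists.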