EndoGaussian: Real-time Gaussian Splatting for Dynamic Endoscopic Scene Reconstruction

Yifan Liu, Chenxin Li, Chen Yang, Yixuan Yuan
2024-02-13
Abstract: Reconstructing deformable tissues from endoscopic videos is essential in many downstream surgical applications. However, existing methods suffer from slow rendering speed, greatly limiting their practical use. In this paper, we introduce EndoGaussian, a real-time endoscopic scene reconstruction framework built on 3D Gaussian Splatting (3DGS). By integrating the efficient Gaussian representation and a highly optimized rendering engine, our framework boosts the rendering speed to a real-time level. To adapt 3DGS for endoscopic scenes, we propose two strategies, Holistic Gaussian Initialization (HGI) and Spatio-temporal Gaussian Tracking (SGT), to handle the non-trivial Gaussian initialization and tissue deformation problems, respectively. In HGI, we leverage recent depth estimation models to predict depth maps of input binocular/monocular image sequences, based on which pixels are re-projected and combined for holistic initialization. In SGT, we propose to model surface dynamics using a deformation field, which is composed of an efficient encoding voxel and a lightweight deformation decoder, allowing for Gaussian tracking with minor training and rendering burden. Experiments on public datasets demonstrate our efficacy against prior SOTAs in many aspects, including better rendering speed (195 FPS real-time, 100$\times$ gain), better rendering quality (37.848 PSNR), and less training overhead (within 2 min/scene), showing significant promise for intraoperative surgery applications. Code is available at: \url{
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses slow rendering in the reconstruction of endoscopic surgical scenes, which limits the practical use of existing methods. It proposes EndoGaussian, a framework built on 3D Gaussian Splatting (3DGS) that raises rendering to real-time speed by combining an efficient Gaussian representation and a highly optimized rendering engine with two strategies: Holistic Gaussian Initialization (HGI) and Spatio-temporal Gaussian Tracking (SGT). HGI initializes the Gaussians from estimated depth maps, and SGT tracks tissue deformation with a deformation field. The method outperforms previous SOTA methods on public datasets, with faster rendering (195 FPS), better rendering quality (37.848 PSNR), and lower training cost (within 2 min/scene).
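The depth re-projection behind HGI, as the abstract describes it, can be illustrated with a minimal sketch. This is not the authors' code: it assumes a pinhole camera with known intrinsics (`fx`, `fy`, `cx`, `cy`), a camera that stays fixed across frames, and optional per-frame validity masks. The function names `backproject_depth` and `combine_frames` are hypothetical.

```python
import numpy as np

def backproject_depth(depth, fx, fy, cx, cy, mask=None):
    """Lift an (H, W) depth map to camera-space 3D points (pinhole model).

    Assumption: depth holds z-distances along the optical axis, and
    non-positive values mark missing depth.
    """
    h, w = depth.shape
    # Pixel grids: u indexes columns (x direction), v indexes rows (y direction).
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    valid = depth > 0
    if mask is not None:
        valid &= mask.astype(bool)
    z = depth[valid]
    x = (u[valid] - cx) * z / fx
    y = (v[valid] - cy) * z / fy
    return np.stack([x, y, z], axis=-1)  # shape (N, 3)

def combine_frames(depths, masks, fx, fy, cx, cy):
    """Concatenate the back-projected points of several frames into one cloud.

    Assumption: the camera is static, so no per-frame pose is applied.
    """
    clouds = [backproject_depth(d, fx, fy, cx, cy, m)
              for d, m in zip(depths, masks)]
    return np.concatenate(clouds, axis=0)
```

Under these assumptions, the combined cloud would supply the initial Gaussian centers. Merging several frames matters because a region hidden in one frame (for example, behind an instrument) may be visible in another.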