Realistic Surgical Image Dataset Generation Based On 3D Gaussian Splatting

Tianle Zeng, Gerardo Loza Galindo, Junlei Hu, Pietro Valdastri, Dominic Jones

2024-07-20

Abstract: Computer vision technologies markedly enhance the automation capabilities of robotic-assisted minimally invasive surgery (RAMIS) through advanced tool tracking, detection, and localization. However, the limited availability of comprehensive surgical datasets for training represents a significant challenge in this field. This research introduces a novel method that employs 3D Gaussian Splatting to generate synthetic surgical datasets. We propose a method for extracting 3D Gaussian representations of surgical instruments and background operating environments, then transforming and combining them to generate high-fidelity synthetic surgical scenarios. We developed a data recording system capable of acquiring images alongside tool and camera poses in a surgical scene. Using this pose data, we synthetically replicate the scene, thereby enabling a direct comparison between synthetic and real images (29.592 PSNR). As a further validation, we trained two YOLOv5 models, one on the synthetic data and one on the real data, and assessed their performance on an unseen real-world test dataset. The synthetic-trained model outperformed the real-trained model by 12% when both were tested on real-world data.
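The abstract reports image quality as PSNR between a rendered synthetic frame and the matching real frame. As a minimal sketch of how that metric is commonly computed, assuming 8-bit images of identical shape (the function name `psnr` and the `max_value` parameter are illustrative and not taken from the paper):

```python
import numpy as np


def psnr(reference: np.ndarray, rendered: np.ndarray, max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two images of the same shape.

    `max_value` is the largest possible pixel value (255 for 8-bit images);
    this default is an assumption, not a setting reported by the paper.
    """
    if reference.shape != rendered.shape:
        raise ValueError("images must have the same shape")
    # Cast to float first so uint8 subtraction cannot wrap around.
    diff = reference.astype(np.float64) - rendered.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0.0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_value ** 2 / mse))


if __name__ == "__main__":
    real = np.full((4, 4, 3), 100, dtype=np.uint8)
    synthetic = np.full((4, 4, 3), 110, dtype=np.uint8)
    print(f"PSNR: {psnr(real, synthetic):.3f} dB")
```

Higher values mean the rendered frame is closer to the real one; a dataset-level score would typically be the mean over all frame pairs.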

Computer Vision and Pattern Recognition

What problem does this paper attempt to address?

The paper addresses the difficulty of training supervised computer vision models for Robot-Assisted Minimally Invasive Surgery (RAMIS) when high-quality annotated data is scarce. It proposes a method based on 3D Gaussian Splatting for generating synthetic surgical image datasets. The method extracts 3D Gaussian representations of surgical instruments and of the background environment, then transforms and fuses them to produce high-fidelity synthetic surgical scenes. The paper also introduces a data recording system that captures images of the surgical scene together with the poses of the instruments and the camera. This pose data is used to re-create the recorded scene synthetically, so synthetic images can be compared directly with their real counterparts. Two YOLOv5 models, one trained on synthetic data and one on real data, were evaluated on an unseen real-world test dataset; the synthetic-trained model outperformed the real-trained model by 12%, which supports the effectiveness of the proposed method. Overall, the core contribution of the paper lies in the first application of 3D Gaussian Splatting to medical image dataset generation, and in a flexible and efficient method that generates high-quality synthetic images together with precise, automatically produced annotations for training neural networks.
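The transform-and-combine step described above can be illustrated with a short sketch. It assumes each Gaussian is stored as a mean and a 3x3 covariance, and that a tool pose is a rigid transform (rotation `R`, translation `t`). The function names and the dictionary layout are hypothetical, not the authors' implementation, and other per-Gaussian attributes used in Gaussian Splatting (opacity, colour coefficients) are left out for brevity.

```python
import numpy as np


def transform_gaussians(means, covs, rotation, translation):
    """Apply a rigid pose to Gaussian means (N, 3) and covariances (N, 3, 3).

    Means move as R @ mu + t; covariances rotate as R @ Sigma @ R.T.
    """
    new_means = means @ rotation.T + translation
    # matmul broadcasts the (3, 3) rotation over the stack of N covariances.
    new_covs = rotation @ covs @ rotation.T
    return new_means, new_covs


def compose_scene(background, tool, tool_pose):
    """Place the tool Gaussians at `tool_pose` and merge them with the background.

    `background` and `tool` are dicts with "means" and "covs" arrays
    (a hypothetical layout chosen for this sketch).
    """
    rotation, translation = tool_pose
    tool_means, tool_covs = transform_gaussians(
        tool["means"], tool["covs"], rotation, translation
    )
    return {
        "means": np.concatenate([background["means"], tool_means], axis=0),
        "covs": np.concatenate([background["covs"], tool_covs], axis=0),
    }


if __name__ == "__main__":
    # 90-degree rotation about the z axis, followed by a shift of +1 along z.
    rot_z = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
    pose = (rot_z, np.array([0.0, 0.0, 1.0]))
    tool = {"means": np.array([[1.0, 0.0, 0.0]]), "covs": np.diag([4.0, 1.0, 1.0])[None]}
    background = {"means": np.zeros((2, 3)), "covs": np.tile(np.eye(3), (2, 1, 1))}
    scene = compose_scene(background, tool, pose)
    print(scene["means"].shape, scene["covs"].shape)
```

Because the tool Gaussians keep their own index range after concatenation, a pipeline built this way knows which Gaussians belong to the instrument, which is one plausible route to the automatic annotations the summary mentions.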

Realistic Data Generation for 6D Pose Estimation of Surgical Instruments

SurgicalGS: Dynamic 3D Gaussian Splatting for Accurate Robotic-Assisted Surgical Scene Reconstruction

Lactation and reproduction.

SurgeoNet: Realtime 3D Pose Estimation of Articulated Surgical Instruments from Stereo Images using a Synthetically-trained Network

SurgicalGaussian: Deformable 3D Gaussians for High-Fidelity Surgical Scene Reconstruction

SimuScope: Realistic Endoscopic Synthetic Dataset Generation through Surgical Simulation and Diffusion Models

A Review of 3D Reconstruction Techniques for Deformable Tissues in Robotic Surgery

SSIS-Seg: Simulation-Supervised Image Synthesis for Surgical Instrument Segmentation

EndoGS: Deformable Endoscopic Tissues Reconstruction with Gaussian Splatting

Towards markerless surgical tool and hand pose estimation

SimEndoGS: Efficient Data-driven Scene Simulation using Robotic Surgery Videos via Physics-embedded 3D Gaussians

Local Style Preservation in Improved GAN-Driven Synthetic Image Generation for Endoscopic Tool Segmentation

Monocular pose estimation of articulated surgical instruments in open surgery

Surgical scene generation and adversarial networks for physics-based iOCT synthesis

Navigating the Synthetic Realm: Harnessing Diffusion-based Models for Laparoscopic Text-to-Image Generation

Endo-4DGS: Endoscopic Monocular Scene Reconstruction with 4D Gaussian Splatting

Cross-Domain Conditional Generative Adversarial Networks for Stereoscopic Hyperrealism in Surgical Training

POV-Surgery: A Dataset for Egocentric Hand and Tool Pose Estimation During Surgical Activities

Real-time Surgical Environment Enhancement for Robot-Assisted Minimally Invasive Surgery Based on Super-Resolution

Realistic Surgical Simulation from Monocular Videos