Abstract: In this paper, part of the DREAMING Challenge - Diminished Reality for Emerging Applications in Medicine through Inpainting, we introduce a refined video inpainting technique optimised from the ProPainter method to meet the specialised demands of medical imaging, specifically in the context of oral and maxillofacial surgery. Our enhanced algorithm employs the zero-shot ProPainter, featuring optimised parameters and pre-processing, to adeptly manage the complex task of inpainting surgical video sequences, without requiring any training process. It aims to produce temporally coherent and detail-rich reconstructions of occluded regions, facilitating clearer views of operative fields. The efficacy of our approach is evaluated using comprehensive metrics, positioning it as a significant advancement in the application of diminished reality for medical purposes.

What problem does this paper attempt to address?

This paper addresses the problem of key areas in oral and maxillofacial surgery videos being occluded by surgical instruments and the surgeon's hands. Specifically, the authors propose an optimized version of the ProPainter video inpainting technique that generates temporally coherent and detail-rich reconstructions of occluded areas without any training, thereby providing a clearer surgical view.

### Main Issues

1. **Occluded Area Restoration**: In surgical videos, surgical instruments and the surgeon's hands can obscure the patient's face or other important parts, so an algorithm must remove these occlusions and restore the background behind them.
2. **Temporal Consistency**: Maintaining temporal coherence during video inpainting is crucial to avoid noticeable differences between frames.
3. **Real-time Processing**: To meet practical application needs, the inpainting algorithm must finish processing within a limited time to support real-time or near-real-time scenarios.

### Solution

- **Optimized ProPainter Model**: The authors optimized the existing ProPainter method, including parameter adjustments and preprocessing techniques, to better adapt it to the specific needs of medical imaging.
- **Zero-shot Learning**: Pre-trained model weights are used as-is, so video inpainting requires no additional training.
- **Multi-domain Propagation Mechanism**: Image propagation and feature propagation together ensure spatial and temporal consistency.
- **Sparse Video Transformer**: A sparse video transformer improves computational efficiency while maintaining the quality of the inpainting results.

### Experimental Results

- **Performance Evaluation**: The model's performance was evaluated using several metrics (such as LPIPS, FID, MAE, and PSNR), and the results showed that this method outperformed existing state-of-the-art methods (such as Stable Diffusion) on multiple metrics.
- **Practical Application**: In the first phase of the DREAMING challenge, this method achieved first place, demonstrating its effectiveness in practical application scenarios.

### Conclusion

By optimizing the ProPainter framework, this study addressed the restoration of occluded areas in oral and maxillofacial surgery videos, providing temporally coherent and detail-rich reconstruction results and thereby supporting better visual clarity in medical practice.
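For readers unfamiliar with the pixel-level metrics named in the experimental results, the following is a minimal sketch of the standard textbook definitions of MAE and PSNR. It is not the challenge's evaluation code: the function names `mae` and `psnr`, the flattened-list frame representation, and the 8-bit peak value of 255 are all assumptions made for illustration. LPIPS and FID are omitted because they depend on pre-trained networks.

```python
import math
from typing import Sequence


def mae(reference: Sequence[float], restored: Sequence[float]) -> float:
    """Mean absolute error between two flattened frames (lower is better)."""
    if len(reference) != len(restored) or len(reference) == 0:
        raise ValueError("frames must be non-empty and the same size")
    return sum(abs(r - p) for r, p in zip(reference, restored)) / len(reference)


def psnr(reference: Sequence[float], restored: Sequence[float],
         max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB (higher is better).

    max_value is the largest possible pixel value; 255 assumes 8-bit frames.
    """
    if len(reference) != len(restored) or len(reference) == 0:
        raise ValueError("frames must be non-empty and the same size")
    mse = sum((r - p) ** 2 for r, p in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames: no noise at all
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical toy data: four pixel intensities from a ground-truth frame
# and from the corresponding inpainted frame.
ground_truth = [10, 20, 30, 40]
inpainted = [12, 18, 30, 44]

print(f"MAE:  {mae(ground_truth, inpainted):.3f}")
print(f"PSNR: {psnr(ground_truth, inpainted):.2f} dB")
```

In an inpainting evaluation, these scores are typically averaged over all frames of a sequence; whether they are computed over the full frame or only the masked region is a design choice the summary above does not specify.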

Optimised ProPainter for Video Diminished Reality Inpainting
