Abstract: Medical radiography segmentation, and specifically dental radiography, is highly limited by the cost of labeling which requires specific expertise and labor-intensive annotations. In this work, we propose a straightforward pre-training method for semantic segmentation leveraging Denoising Diffusion Probabilistic Models (DDPM), which have shown impressive results for generative modeling. Our straightforward approach achieves remarkable performance in terms of label efficiency and does not require architectural modifications between pre-training and downstream tasks. We propose to first pre-train a Unet by exploiting the DDPM training objective, and then fine-tune the resulting model on a segmentation task. Our experimental results on the segmentation of dental radiographs demonstrate that the proposed method is competitive with state-of-the-art pre-training methods.
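The DDPM training objective named in the abstract can be sketched roughly as follows. This is a minimal NumPy illustration of the standard noise-prediction loss, not the authors' implementation: the linear beta schedule, the 1000 diffusion steps, the image shape, and the `zero_denoiser` placeholder (standing in for the Unet) are all assumptions made for the example.

```python
import numpy as np

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Variance schedule (assumed linear here, as in the original DDPM formulation)."""
    return np.linspace(beta_start, beta_end, num_steps)

def q_sample(x0, t, alpha_bar, noise):
    """Forward diffusion: x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * noise."""
    scale = np.sqrt(alpha_bar[t])[:, None, None, None]
    sigma = np.sqrt(1.0 - alpha_bar[t])[:, None, None, None]
    return scale * x0 + sigma * noise

def ddpm_loss(denoiser, x0, alpha_bar, rng):
    """Noise-prediction loss: MSE between the true noise and the denoiser output."""
    t = rng.integers(0, len(alpha_bar), size=x0.shape[0])  # one timestep per image
    noise = rng.standard_normal(x0.shape)
    x_t = q_sample(x0, t, alpha_bar, noise)
    predicted_noise = denoiser(x_t, t)
    return float(np.mean((predicted_noise - noise) ** 2))

betas = linear_beta_schedule(1000)
alpha_bar = np.cumprod(1.0 - betas)  # cumulative product of (1 - beta_t)

rng = np.random.default_rng(0)
# Random arrays standing in for unlabeled radiographs, shape (batch, channel, H, W).
x0 = rng.uniform(-1.0, 1.0, size=(4, 1, 32, 32))

# Hypothetical placeholder for the Unet: always predicts zero noise.
zero_denoiser = lambda x_t, t: np.zeros_like(x_t)
loss = ddpm_loss(zero_denoiser, x0, alpha_bar, rng)
```

In pre-training, `denoiser` would be the Unet and this loss would be minimized on the unlabeled images. Because the network maps an image to an output of the same spatial size, the same architecture can then be fine-tuned for segmentation, which is the property the abstract highlights.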

What problem does this paper attempt to address?

The problem this paper attempts to address is the high cost and need for expertise in annotating data for dental radiographic image segmentation tasks, which limits the application of deep learning methods. Specifically, the authors propose a self-supervised pre-training method based on Denoising Diffusion Probabilistic Models (DDPM) to improve label efficiency and achieve performance comparable to or better than existing pre-training methods with a small amount of annotated data.

### Background of the Paper

- **Problem Background**: Automatic semantic segmentation of dental radiographic images is of great significance for clinical practice, as it can assist doctors in quickly and accurately identifying anatomical and pathological elements. However, while deep learning methods perform robustly in segmentation tasks, they require a large amount of pixel-level annotated data, which is time-consuming to obtain and requires professional medical knowledge.
- **Existing Solutions**: Many recent methods have adopted self-supervised learning as a pre-training step to reduce the need for annotated data, such as using Generative Adversarial Networks (GANs) or contrastive learning methods (e.g., MoCo v2) for pre-training.

### Contributions of the Paper

- **Method Overview**: The authors propose a method called "Pre-Training with Diffusion models for Dental Radiography segmentation" (PTDR), which includes two steps:
  1. **Pre-training Phase**: Using DDPM for self-supervised pre-training on a large amount of unlabeled dental radiographic images.
  2. **Fine-tuning Phase**: Fine-tuning the pre-trained model on a small amount of annotated data to complete the semantic segmentation task.
- **Innovations**:
  - **Simplicity**: The entire Unet architecture is trained in one go during the pre-training phase, without the need for complex feature extraction or additional classifiers as in other methods.
  - **Efficiency**: The PTDR method outperforms existing pre-training methods with a small amount of annotated data.
  - **Flexibility**: Both the pre-training and inference phases require only one forward pass, with the time step fixed to a predetermined value, simplifying the training and inference process.

### Experimental Results

- **Experimental Setup**: The authors conducted experiments on a dental bitewing radiographic image dataset, which includes 2500 unlabeled images and 100 labeled images. The labeled data was randomly divided into training, validation, and test sets.
- **Performance Evaluation**: The mean Intersection over Union (mIoU) was used as the evaluation metric.
- **Results**:
  - With 10 annotated samples, the PTDR method achieved an mIoU of 76.96%, significantly outperforming other pre-training methods.
  - The PTDR method showed better performance with different numbers of annotated samples (1, 2, 5, 10).
  - The impact of pre-training iterations: The pre-training effect was optimal between 10k and 50k iterations, after which the performance tended to saturate.

### Conclusion

- **Main Conclusion**: The PTDR method, by leveraging DDPM for self-supervised pre-training, can achieve efficient semantic segmentation of dental radiographic images with a small amount of annotated data. This method is not only simple and easy to use but also performs well on multiple datasets and tasks.
- **Future Work**: The authors plan to apply this method to other types of medical imaging datasets to further verify its generalization ability.
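For readers unfamiliar with the evaluation metric mentioned above, mean Intersection over Union can be computed along these lines. This is a small sketch under stated assumptions: the summary does not say whether the paper averages per image or over the whole test set, nor how it treats classes missing from an image, so the choice here (skip classes absent from both prediction and ground truth) is illustrative. The 3x3 label maps are made-up toy data.

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Average per-class IoU between two integer label maps of equal shape."""
    ious = []
    for c in range(num_classes):
        pred_c = pred == c
        target_c = target == c
        union = np.logical_or(pred_c, target_c).sum()
        if union == 0:
            continue  # class absent from both maps: skipped (an assumption)
        intersection = np.logical_and(pred_c, target_c).sum()
        ious.append(intersection / union)
    return float(np.mean(ious))

# Toy ground truth and prediction with three classes (0, 1, 2).
target = np.array([[0, 0, 1],
                   [0, 1, 1],
                   [2, 2, 2]])
pred = np.array([[0, 0, 1],
                 [0, 0, 1],
                 [2, 2, 1]])

score = mean_iou(pred, target, num_classes=3)
# Per-class IoU: class 0 = 3/4, class 1 = 2/4, class 2 = 2/3, so the mean is 23/36.
```

A score reported as a percentage, such as the 76.96% quoted above, corresponds to this value multiplied by 100.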

Pre-Training with Diffusion models for Dental Radiography segmentation

Accelerating Diffusion Models Via Pre-segmentation Diffusion Sampling for Medical Image Segmentation

Denoising Diffusion Medical Models

A Recycling Training Strategy for Medical Image Segmentation with Diffusion Denoising Models

Paired Diffusion: Generation of related, synthetic PET-CT-Segmentation scans using Linked Denoising Diffusion Probabilistic Models

Analysing Diffusion Segmentation for Medical Images

Importance of Aligning Training Strategy with Evaluation for Diffusion Models in 3D Multiclass Segmentation

Denoising Diffusions in Latent Space for Medical Image Segmentation

Hybrid diffusion models: combining supervised and generative pretraining for label-efficient fine-tuning of segmentation models

Denoising diffusion probabilistic models for generation of realistic fully-annotated microscopy image datasets

Memory-Efficient 3D Denoising Diffusion Models for Medical Image Processing

DiNO-Diffusion. Scaling Medical Diffusion via Self-Supervised Pre-Training

Accelerating denoising diffusion probabilistic model via truncated inverse processes for medical image segmentation

Enhancing Label-efficient Medical Image Segmentation with Text-guided Diffusion Models

Denoising Diffusion Semantic Segmentation with Mask Prior Modeling

Denoising Diffusion Probabilistic Models for Generation of Realistic Fully-Annotated Microscopy Image Data Sets

Diffusion Models for Memory-efficient Processing of 3D Medical Images

Label-Efficient Semantic Segmentation with Diffusion Models

Introducing Shape Prior Module in Diffusion Model for Medical Image Segmentation

Enhanced segmentation of femoral bone metastasis in CT scans of patients using synthetic data generation with 3D diffusion models

PET image denoising based on denoising diffusion probabilistic models