Pre-Training with Diffusion models for Dental Radiography segmentation

Jérémy Rousseau, Christian Alaka, Emma Covili, Hippolyte Mayard, Laura Misrachi, Willy Au
2023-07-27
Abstract: Medical radiography segmentation, and specifically dental radiography, is highly limited by the cost of labeling which requires specific expertise and labor-intensive annotations. In this work, we propose a straightforward pre-training method for semantic segmentation leveraging Denoising Diffusion Probabilistic Models (DDPM), which have shown impressive results for generative modeling. Our straightforward approach achieves remarkable performance in terms of label efficiency and does not require architectural modifications between pre-training and downstream tasks. We propose to first pre-train a Unet by exploiting the DDPM training objective, and then fine-tune the resulting model on a segmentation task. Our experimental results on the segmentation of dental radiographs demonstrate that the proposed method is competitive with state-of-the-art pre-training methods.
Computer Vision and Pattern Recognition, Machine Learning
What problem does this paper attempt to address?
The problem this paper attempts to address is the high cost of, and expertise required for, annotating data for dental radiograph segmentation, which limits the application of deep learning methods. Specifically, the authors propose a self-supervised pre-training method based on Denoising Diffusion Probabilistic Models (DDPM) to improve label efficiency and to match or exceed existing pre-training methods when only a small amount of annotated data is available.

### Background of the Paper

- **Problem Background**: Automatic semantic segmentation of dental radiographs matters for clinical practice because it can help clinicians identify anatomical and pathological elements quickly and accurately. Deep learning methods perform robustly on segmentation tasks, but they require a large amount of pixel-level annotated data, which is time-consuming to obtain and requires professional medical knowledge.
- **Existing Solutions**: Many recent methods adopt self-supervised learning as a pre-training step to reduce the need for annotated data, for example pre-training with Generative Adversarial Networks (GANs) or with contrastive learning methods such as MoCo v2.

### Contributions of the Paper

- **Method Overview**: The authors propose a method called "Pre-Training with Diffusion models for Dental Radiography segmentation" (PTDR), which consists of two steps:
  1. **Pre-training Phase**: Self-supervised pre-training with the DDPM objective on a large amount of unlabeled dental radiographs.
  2. **Fine-tuning Phase**: Fine-tuning the pre-trained model on a small amount of annotated data for the semantic segmentation task.
- **Innovations**:
  - **Simplicity**: The entire Unet architecture is pre-trained in a single stage, without the complex feature extraction or additional classifiers that other methods need.
  - **Efficiency**: The PTDR method outperforms existing pre-training methods when only a small amount of annotated data is available.
  - **Flexibility**: Both the pre-training and inference phases require only one forward pass, with the time step fixed to a predetermined value, which simplifies training and inference.

### Experimental Results

- **Experimental Setup**: The authors conducted experiments on a dataset of dental bitewing radiographs that includes 2500 unlabeled images and 100 labeled images. The labeled data was randomly divided into training, validation, and test sets.
- **Performance Evaluation**: The mean Intersection over Union (mIoU) was used as the evaluation metric.
- **Results**:
  - With 10 annotated samples, the PTDR method achieved an mIoU of 76.96%, significantly outperforming the other pre-training methods.
  - The PTDR method performed better than the compared pre-training methods at each tested number of annotated samples (1, 2, 5, 10).
  - Impact of pre-training iterations: pre-training was most effective between 10k and 50k iterations, after which performance tended to saturate.

### Conclusion

- **Main Conclusion**: By leveraging DDPM for self-supervised pre-training, the PTDR method achieves label-efficient semantic segmentation of dental radiographs with a small amount of annotated data. The method is simple to use and performs well on the dental bitewing segmentation task evaluated.
- **Future Work**: The authors plan to apply this method to other types of medical imaging datasets to further verify its generalization ability.
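
For readers unfamiliar with the DDPM training objective referred to in the abstract and in the pre-training phase, its commonly used simplified form is given here as background. This is the standard formulation from the DDPM literature. The summary does not state which exact variant or noise schedule the authors use, and the symbols ($x_0$, $\beta_s$, $\bar{\alpha}_t$, $\epsilon_\theta$) are notation introduced here for illustration.

$$
x_t = \sqrt{\bar{\alpha}_t}\, x_0 + \sqrt{1 - \bar{\alpha}_t}\, \epsilon,
\qquad \epsilon \sim \mathcal{N}(0, I),
\qquad \bar{\alpha}_t = \prod_{s=1}^{t} (1 - \beta_s)
$$

$$
\mathcal{L}_{\text{simple}}(\theta) = \mathbb{E}_{x_0,\, \epsilon,\, t}\left[ \left\lVert \epsilon - \epsilon_\theta(x_t, t) \right\rVert^2 \right]
$$

In this notation $x_0$ is an unlabeled radiograph, $x_t$ is its noised version at time step $t$, and $\epsilon_\theta$ is the Unet that predicts the added noise. No segmentation labels appear in the loss, which is why the objective can be used for self-supervised pre-training.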
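
The mean Intersection over Union metric named under Performance Evaluation can be illustrated with a minimal pure-Python sketch. The function name `mean_iou`, the flat-list input format, and the choice to skip classes that are absent from both the prediction and the ground truth are assumptions made for illustration; the summary does not describe the authors' exact implementation.

```python
from typing import List, Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Mean IoU over classes, for flattened per-pixel class labels.

    Assumption: a class that appears in neither `pred` nor `target`
    is skipped rather than counted as IoU 0 or 1.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes
    pred_count = [0] * num_classes
    target_count = [0] * num_classes

    # Count per-class pixels in the prediction, the ground truth, and their overlap.
    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious: List[float] = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else float("nan")


if __name__ == "__main__":
    # Toy example with 3 classes and 6 "pixels".
    pred = [0, 0, 1, 1, 2, 2]
    target = [0, 1, 1, 1, 2, 0]
    # Per-class IoU: 1/3, 2/3, 1/2, so the mean is 0.5.
    print(round(mean_iou(pred, target, num_classes=3), 4))  # 0.5
```

In practice the predicted and ground-truth masks would be 2D arrays flattened to this form, and the per-image or per-dataset aggregation would follow whatever convention the evaluation protocol specifies.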