BerDiff: Conditional Bernoulli Diffusion Model for Medical Image Segmentation

Tao Chen, Chenhui Wang, Hongming Shan
DOI: https://doi.org/10.1007/978-3-031-43901-8_47
2023-04-10
Abstract: Medical image segmentation is a challenging task with inherent ambiguity and high uncertainty, attributed to factors such as unclear tumor boundaries and multiple plausible annotations. The accuracy and diversity of segmentation masks are both crucial for providing valuable references to radiologists in clinical practice. While existing diffusion models have shown strong capacities in various visual generation tasks, it is still challenging to deal with discrete masks in segmentation. To achieve accurate and diverse medical image segmentation masks, we propose a novel conditional Bernoulli Diffusion model for medical image segmentation (BerDiff). Instead of using the Gaussian noise, we first propose to use the Bernoulli noise as the diffusion kernel to enhance the capacity of the diffusion model for binary segmentation tasks, resulting in more accurate segmentation masks. Second, by leveraging the stochastic nature of the diffusion model, our BerDiff randomly samples the initial Bernoulli noise and intermediate latent variables multiple times to produce a range of diverse segmentation masks, which can highlight salient regions of interest that can serve as valuable references for radiologists. In addition, our BerDiff can efficiently sample sub-sequences from the overall trajectory of the reverse diffusion, thereby speeding up the segmentation process. Extensive experimental results on two medical image segmentation datasets with different modalities demonstrate that our BerDiff outperforms other recently published state-of-the-art methods. Our results suggest diffusion models could serve as a strong backbone for medical image segmentation.
Computer Vision and Pattern Recognition, Artificial Intelligence
What problem does this paper attempt to address?
The paper addresses the inherent ambiguity and high uncertainty of medical image segmentation. Factors such as unclear tumor boundaries and multiple plausible annotations (for example, in lung nodule CT images) mean that methods which output a single, deterministic most-likely hypothesis may contribute to misdiagnosis or suboptimal treatment plans. Providing segmentation masks that are both accurate and diverse is therefore crucial as a reference for radiologists in clinical practice.
To this end, the authors propose BerDiff, a conditional Bernoulli diffusion model for medical image segmentation. Unlike diffusion models based on Gaussian noise, BerDiff uses Bernoulli noise as the diffusion kernel, which is intended to suit binary segmentation tasks and yield more accurate masks. By leveraging the stochasticity of the diffusion model, BerDiff samples the initial Bernoulli noise and the intermediate latent variables multiple times to generate a range of diverse segmentation masks that highlight salient regions of interest. It can also sample sub-sequences of the overall reverse diffusion trajectory, which accelerates segmentation.
The main contributions of the paper are:
1. A new conditional diffusion model based on Bernoulli noise for discrete binary segmentation tasks, producing accurate and diverse medical image segmentation masks.
2. Efficient sampling of sub-sequences from the overall reverse diffusion trajectory, which speeds up the segmentation process.
3. Experiments on two medical image segmentation datasets of different modalities, LIDC-IDRI (CT) and BRATS 2021 (MRI), in which BerDiff outperforms other state-of-the-art methods.
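To make the idea of a Bernoulli diffusion kernel more concrete, here is a minimal NumPy sketch. It assumes the commonly used closed-form forward kernel in which the noised mask is drawn as y_t ~ Bernoulli(alpha_bar_t * y_0 + (1 - alpha_bar_t) / 2), so that a mask gradually decays toward fair coin flips. The linear beta schedule, the function names, and the evenly spaced timestep sub-sequence are illustrative assumptions rather than details stated in the summary above, and no trained denoising network is included.

```python
import numpy as np

def cumulative_alphas(betas):
    """Return alpha_bar_t = prod_{s<=t} (1 - beta_s) for a noise schedule."""
    return np.cumprod(1.0 - np.asarray(betas, dtype=np.float64))

def bernoulli_forward(y0, alpha_bar_t, rng):
    """Sample y_t ~ Bernoulli(alpha_bar_t * y0 + (1 - alpha_bar_t) / 2).

    Assumed forward kernel: each pixel keeps its label with high
    probability early on and approaches Bernoulli(0.5) as alpha_bar_t -> 0.
    """
    probs = alpha_bar_t * y0 + (1.0 - alpha_bar_t) / 2.0
    return (rng.random(y0.shape) < probs).astype(np.uint8)

def timestep_subsequence(num_steps, num_sampled):
    """Evenly spaced, strictly decreasing timesteps for a shortened reverse pass.

    Hypothetical helper: one simple way to pick a sub-sequence of the
    full trajectory; the paper's actual selection rule may differ.
    """
    steps = np.linspace(0, num_steps - 1, num_sampled).round().astype(int)
    return np.unique(steps)[::-1]

rng = np.random.default_rng(0)
betas = np.linspace(1e-4, 0.02, 1000)  # assumed linear schedule
alpha_bar = cumulative_alphas(betas)

y0 = np.zeros((64, 64), dtype=np.uint8)
y0[16:48, 16:48] = 1                   # toy binary mask

y_early = bernoulli_forward(y0, alpha_bar[10], rng)  # almost unchanged
y_late = bernoulli_forward(y0, alpha_bar[-1], rng)   # close to pure noise
timesteps = timestep_subsequence(len(betas), 10)     # 10 of 1000 steps
```

In a full pipeline, a network conditioned on the image would be applied at each timestep in `timesteps`, and repeating the reverse pass with different random draws would give the diverse masks the summary describes. That part is omitted here because it depends on model details not covered in this text.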