TransDiff: medical image segmentation method based on Swin Transformer with diffusion probabilistic model

Xiaoxiao Liu,Yan Zhao,Shigang Wang,Jian Wei
DOI: https://doi.org/10.1007/s10489-024-05496-w
IF: 5.3
2024-05-19
Applied Intelligence
Abstract:Medical image segmentation can provide a reliable basis for clinical analysis and diagnosis. However, this task is challenging due to the low contrast, boundary ambiguity between organs or lesions and surrounding tissues, and noise interference of images. To address this challenge, which is unique to medical images, and further improve the segmentation accuracy and precision, a medical image segmentation model (TransDiff) is proposed from the perspective of improving model robustness and enriching semantic information. TransDiff comprises three parts: a variational autoencoder (VAE), a diffusion transformer model and a Swin Transformer. The VAE constructs a latent space to provide an environment for fully extracting and fusing features. The diffusion model predicts and removes noise by inferring semantics through the propagation of information between nodes. The Swin Transformer enriches discriminative features as a conditional part. TransDiff inherits the robustness to noise and missing data of the diffusion model and the feature enrichment of the Swin Transformer, thus exhibiting a higher understanding of semantic information. It performs well on medical datasets with three different image modalities, outperforms existing medical image segmentation methods in terms of segmentation precision and accuracy, and has good generalizability. The codes and trained models will be publicly available at https://github.com/xiaoxiao1997/TransDiff.
computer science, artificial intelligence
What problem does this paper attempt to address?