Robust Cross-modal Medical Image Translation Via Diffusion Model and Knowledge Distillation

Yuehan Xia, Saifeng Feng, Jianhui Zhao, Zhiyong Yuan
DOI: https://doi.org/10.1109/ijcnn60899.2024.10650498
2024-01-01
Abstract: Medical image translation is valuable but difficult, because noise patterns vary while the anatomical content of the image must remain invariant. Deep learning approaches, most commonly Generative Adversarial Networks (GANs), have been developed to learn multimodal mappings and produce translated images. However, the results produced by their generators remain far from perfect for medical images, given the dual requirements of handling style variations in noise patterns and preserving anatomy. This paper proposes a medical image translation framework based on a diffusion model and knowledge distillation. To make adversarial training more robust and the generated images more accurate, the framework, unlike traditional GANs, places an adaptive forward diffusion module after the generator for data augmentation. The discriminator is timestep-dependent: real and generated images undergo the same forward diffusion process, and the discriminator learns to distinguish between them at each timestep. Finally, an additional refinement network consists of teacher and student modules that are structurally similar but receive different inputs. Unlike existing knowledge distillation approaches, the teacher module is designed as a registration network with more inputs, so that it better learns the noise distribution and further refines the translation results during training. Its knowledge is then distilled into the student module to ensure superior translation results. Extensive experiments on two public medical image datasets, including comparisons with state-of-the-art (SOTA) methods, demonstrate that the model produces higher-quality and more robust images.
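The shared forward diffusion step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the standard closed-form forward process from denoising diffusion models, x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps, with a linear noise schedule, and it omits the adaptive adjustment of the diffusion strength that the abstract mentions. All function names, the schedule values, and the flat-list image representation are hypothetical and chosen only for illustration.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=2e-2):
    """Hypothetical linear noise schedule; the paper's schedule is not given."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """Return alpha_bar_t, the running product of (1 - beta_s) for s <= t."""
    alpha_bars, product = [], 1.0
    for beta in betas:
        product *= 1.0 - beta
        alpha_bars.append(product)
    return alpha_bars


def diffuse(pixels, t, alpha_bars, rng):
    """Sample x_t from q(x_t | x_0) for an image given as a flat pixel list."""
    alpha_bar = alpha_bars[t]
    signal_scale = math.sqrt(alpha_bar)
    noise_scale = math.sqrt(1.0 - alpha_bar)
    return [signal_scale * p + noise_scale * rng.gauss(0.0, 1.0) for p in pixels]


def discriminator_inputs(real, generated, alpha_bars, rng):
    """Diffuse a real and a generated image to the same random timestep.

    The timestep is returned as well, so that a timestep-dependent
    discriminator (not implemented here) could be conditioned on it.
    """
    t = rng.randrange(len(alpha_bars))
    noisy_real = diffuse(real, t, alpha_bars, rng)
    noisy_generated = diffuse(generated, t, alpha_bars, rng)
    return noisy_real, noisy_generated, t


if __name__ == "__main__":
    rng = random.Random(0)
    alpha_bars = cumulative_alphas(linear_beta_schedule(1000))
    real = [0.5] * 16        # stand-in for a real target-modality image
    generated = [0.1] * 16   # stand-in for the generator output
    noisy_real, noisy_generated, t = discriminator_inputs(
        real, generated, alpha_bars, rng
    )
    print("timestep:", t, "pixels per image:", len(noisy_real))
```

In this sketch both images are noised to the same timestep before they would reach the discriminator, which is the sense in which the forward diffusion acts as data augmentation for adversarial training.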