Abstract: Machine learning methods are increasingly applied in the medical domain. Supervised learning methods used for classification, prediction, and segmentation of medical images typically lose performance when the training and testing datasets violate the i.i.d. (independent and identically distributed) assumption. Such distribution shifts seriously undermine the robustness, fairness, and trustworthiness of machine learning applications in the medical domain. Hence, building on our previous work [14], we adopt the CycleGAN (cycle-consistent generative adversarial network) method to cycle-train CT (computed tomography) data from different scanners and manufacturers, aiming to eliminate the distribution shift across diverse data terminals. However, due to the mode collapse problem and the generative mechanisms of GAN-based models, the generated images contain serious artifacts. To remove the boundary marks and artifacts, we adopt score-based diffusion generative models to refine the images voxel-wise. This combination of two generative models improves the quality of the data from different providers while preserving significant features. We evaluate on paired medical images from five patients, using the SSIM (structural similarity index measure) metric and a comparison of segmentation model performance. We conclude that CycleGAN is better used as an efficient data augmentation technique than as a method for eliminating distribution shift, whereas the denoising diffusion model is more suitable for handling the distribution shift caused by different terminal modules. A further limitation of generative methods applied to medical images is the difficulty of obtaining large and diverse datasets that accurately capture the complexity and variability of biological structures.
In future work, we will evaluate the original and generated datasets with a broader range of supervised methods. We will also implement the generative methods within a federated learning architecture, which can preserve their benefits and address the distribution shift problem at a broader scale.
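The SSIM evaluation mentioned in the abstract can be illustrated with a minimal sketch. It assumes a simplified, global form of SSIM computed from whole-image statistics over flattened voxel intensities scaled to [0, 1]; the paper does not state which SSIM variant or window size was used, and standard implementations average SSIM over local sliding windows instead. The function name `global_ssim`, the sample intensities, and the constants `k1` and `k2` (the conventional defaults) are illustrative assumptions.

```python
from typing import Sequence


def global_ssim(x: Sequence[float], y: Sequence[float],
                data_range: float = 1.0,
                k1: float = 0.01, k2: float = 0.03) -> float:
    """SSIM from whole-image statistics (assumption: no sliding window)."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("inputs must be non-empty and the same length")
    n = len(x)
    # Stabilising constants; k1 and k2 are the conventional defaults.
    c1 = (k1 * data_range) ** 2
    c2 = (k2 * data_range) ** 2
    # Means, (population) variances, and covariance of the two images.
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    # Luminance term times combined contrast/structure term.
    num = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    den = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return num / den


# Hypothetical flattened voxel intensities from a "paired" scan.
reference = [0.1, 0.4, 0.8, 0.3, 0.6, 0.9]
# Simulated scanner shift: a different gain and offset on the same anatomy.
shifted = [v * 0.7 + 0.1 for v in reference]

print(global_ssim(reference, reference))
print(global_ssim(reference, shifted))
```

An identical pair scores 1, and an intensity shift between scanners lowers the score, which is the kind of comparison a paired-scan evaluation relies on.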

A Method of Dividing Clinical Data Set for Medical Image AI Training

Data Splitting Method of Distance Metric Learning Based on Gaussian Mixed Model

A Survey on Domain Generalization for Medical Image Analysis

Mix-and-Interpolate: A Training Strategy to Deal with Source-Biased Medical Data

Distributed deep learning networks among institutions for medical imaging

Medical College Education Data Analysis Method Based on Improved Deep Learning Algorithm

Improving the Classification Effect of Clinical Images of Diseases for Multi-Source Privacy Protection

Voxel-wise Medical Images Generalization for Eliminating Distribution Shift

SplitAVG: A Heterogeneity-Aware Federated Deep Learning Method for Medical Imaging

Dataset Distillation in Medical Imaging: A Feasibility Study

Navigating Distribution Shifts in Medical Image Analysis: A Survey

Treasure in Distribution: A Domain Randomization based Multi-Source Domain Generalization for 2D Medical Image Segmentation

Target area distillation and section attention segmentation network for accurate 3D medical image segmentation

Institutionally Distributed Deep Learning Networks

Intelligent medical heterogeneous big data set balanced clustering using deep learning

Out-of-distribution Detection in Medical Image Analysis: A survey

A GMM Based Algorithm To Generate Point-Cloud And Its Application To Neuroimaging

Decoupling Predictions in Distributed Learning for Multi-Center Left Atrial MRI Segmentation

Big data analysis for Covid-19 in hospital information systems

Embracing Massive Medical Data

A differentially private distributed collaborative XGBoost method