Abstract: Deployment of machine learning algorithms into real-world practice is still a difficult task. One of the challenges lies in the unpredictable variability of input data, which may differ significantly among individual users, institutions, scanners, etc. The input data variability can be decreased by using suitable data preprocessing with robust data harmonization. In this paper, we present a method of image harmonization using Cumulative Distribution Function (CDF) matching based on curve fitting. This approach does not ruin local variability and individual important features. The transformation of image intensities is non-linear but still "smooth and elastic", as compared to other known histogram matching algorithms. Non-linear transformation allows for a very good match to the template. At the same time, elasticity constraints help to preserve local variability among individual inputs, which may encode important features for subsequent machine-learning processing. The pre-defined template CDF offers a better and more intuitive control for the input data transformation compared to other methods, especially ML-based ones. Even though we demonstrate our method for MRI images, the method is generic enough to apply to other types of imaging data.

What problem does this paper attempt to address?

This paper addresses how to reduce the variability of MRI image data in practical applications, so that machine-learning algorithms behave consistently and accurately across different devices, protocols, and institutions. Specifically, the authors propose an image harmonization method based on cumulative distribution function (CDF) matching, with the following aims:

1. **Reducing the variability of input data**: MRI image data varies significantly among scanners, protocols, and institutions, which makes accurate image interpretation and automatic processing difficult. The problem is especially acute in federated learning, where a single model must adapt to very different (and usually non-independent and identically distributed) image sets.
2. **Preserving local features**: The intensity transformation must not destroy local features and important information in the image. These features are crucial for subsequent machine-learning processing (such as brain tumor segmentation or classification).
3. **Providing intuitive control**: A pre-defined template CDF gives a more intuitive way to control the transformation of input data, which is simpler and more direct than other methods (especially machine-learning-based ones).
4. **Applicable to multiple imaging types**: Although the paper demonstrates the method mainly on MRI images, it is generic enough to extend to other types of imaging data.

### Method Overview

The proposed method is based on CDF matching and is achieved through curve-fitting optimization. The specific steps are:

- **Calculating the CDF of the original image**.
- **Adjusting the original CDF to the template CDF using curve fitting**, allowing for double-scale scaling and translation transformations.
- **Applying tail compression** to ensure that intensity values are within a specified range.
- **Finally, transforming the image intensity** according to the lookup table (LUT) generated in the above steps.

This method not only reduces the variability of MRI images but also enhances image contrast, thereby improving the performance of subsequent machine-learning tasks.

### Experimental Results

The method has been extensively tested on brain MRI images, and the results show that:

- After applying this method, the accuracy of the Brain MRI Screening Tool improved significantly.
- Compared with simple percentile stretching, it performs better on multiple metrics, including the Dice coefficient, global Dice coefficient, sensitivity, and precision.

### Conclusion

The proposed CDF-matching image harmonization method reduces variability while preserving local image features, is applicable to multiple imaging types, and is easy to integrate into existing projects. The Python implementation has been made public for the convenience of researchers and engineers.
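The four steps in the method overview can be illustrated with a minimal NumPy sketch. This is not the authors' published implementation, and it rests on several assumptions: the curve fit is reduced to a single scale `a` and shift `b` (rather than the double-scale fit the summary mentions), the fit is done on quantiles (the inverse CDF) so that it becomes an ordinary least-squares line fit, tail compression is stood in for by a `tanh` soft clip, and the input is a non-negative integer image. The function names `fit_scale_shift`, `compress_tails`, and `harmonize` are hypothetical.

```python
import numpy as np


def fit_scale_shift(image, template_quantiles, probs):
    """Fit a, b so that a * Q_image(p) + b approximates Q_template(p).

    Assumption: one scale and one shift only, fitted by least squares on
    quantiles. Matching quantiles is equivalent to matching CDFs for an
    increasing affine map.
    """
    image_quantiles = np.quantile(image, probs)
    a, b = np.polyfit(image_quantiles, template_quantiles, deg=1)
    return a, b


def compress_tails(values, low, high, margin):
    """Softly squeeze values into [low, high] (illustrative stand-in).

    Values inside [low + margin, high - margin] are unchanged; values
    outside are bent smoothly toward the bounds with tanh.
    """
    values = np.asarray(values, dtype=float)
    lo_in, hi_in = low + margin, high - margin
    out = values.copy()
    upper = values > hi_in
    lower = values < lo_in
    out[upper] = hi_in + margin * np.tanh((values[upper] - hi_in) / margin)
    out[lower] = lo_in - margin * np.tanh((lo_in - values[lower]) / margin)
    # Guard against floating-point rounding at the bounds.
    return np.clip(out, low, high)


def harmonize(image, template_quantiles, probs, low=0.0, high=1.0, margin=0.05):
    """Build a lookup table over the image's intensity levels and apply it."""
    a, b = fit_scale_shift(image, template_quantiles, probs)
    offset = int(image.min())
    levels = np.arange(offset, int(image.max()) + 1)
    lut = compress_tails(a * levels + b, low, high, margin)
    return lut[image.astype(np.int64) - offset]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    probs = np.linspace(0.01, 0.99, 99)
    # Hypothetical template: quantiles of a reference intensity distribution.
    template_quantiles = np.quantile(rng.normal(0.5, 0.15, 10_000), probs)
    # Synthetic stand-in for a scanner image with an arbitrary intensity scale.
    image = rng.normal(1000, 200, size=(64, 64)).clip(0).astype(np.uint16)
    result = harmonize(image, template_quantiles, probs)
    print(result.min(), np.median(result), result.max())
```

Because the fitted map is affine and increasing, the ordering of intensities and their relative local differences are kept inside the uncompressed range, which is the property the summary describes as preserving local features. A richer fit (for example, separate scales on either side of a pivot) would replace `fit_scale_shift` without changing the LUT step.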

Image Harmonization using Robust Restricted CDF Matching
