Cross-site harmonization of diffusion MRI data without matched training subjects

Alberto De Luca, Tine Swartenbroekx, Harro Seelaar, John C van Swieten, Suheyla Cetin Karayumak, Yogesh Rathi, Ofer Pasternak, Lize C Jiskoot, Alexander Leemans
DOI: https://doi.org/10.1101/2024.05.01.591994
2024-06-27
Abstract: Diffusion MRI (dMRI) data typically suffer from marked cross-site variability, which prevents naive pooled analyses. To attenuate this variability, harmonization methods such as rotation-invariant spherical harmonics (RISH) have been introduced to harmonize dMRI data at the signal level. A common requirement of the RISH method is the availability of healthy individuals who are matched at the group level on multiple demographics, including age and sex, so that harmonization features can be learned across sites while minimizing the impact of biological variability. However, such subjects may not be readily available when considering multiple independent cohorts with different population characteristics, particularly in retrospective studies. To overcome this limitation, we present a new approach to RISH harmonization that learns harmonization features while controlling for potential covariates using a voxel-based generalized linear model (RISH-GLM). By design, RISH-GLM can harmonize data from any number of sites simultaneously while accounting for covariates of interest, and therefore does not require matched training subjects. Additionally, RISH-GLM can harmonize data from multiple sites in a single step, whereas RISH is performed for each site independently. To demonstrate RISH-GLM, we considered data of training subjects from retrospective cohorts acquired with 3 different scanners. We performed 3 harmonization experiments of increasing complexity. First, we aimed to demonstrate that RISH-GLM is equivalent to conventional RISH when trained with data from matched training subjects. Second, we aimed to demonstrate that RISH-GLM can effectively learn harmonization from two groups of highly unmatched subjects. Third, we evaluated the ability of RISH-GLM to simultaneously harmonize data from 3 different sites.
Our results demonstrate that RISH-GLM can learn cross-site harmonization from both matched and unmatched groups of training subjects, and can effectively be used to harmonize data from multiple sites in a single step.
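The core idea described in the abstract, a voxel-wise linear model that separates site effects from covariates, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `fit_site_effects` and `scale_maps`, the ordinary least-squares fit, the mean-centering of covariates, and the square-root ratio used for the scale map (borrowed from the conventional RISH formulation) are all assumptions made here for illustration.

```python
import numpy as np

def fit_site_effects(rish, site, covariates):
    """Fit one linear model per voxel: RISH feature ~ site + covariates.

    rish:       (n_subjects, n_voxels) RISH feature values (hypothetical input)
    site:       (n_subjects,) integer site labels
    covariates: (n_subjects, n_cov) e.g. age, sex
    Returns the sorted site labels and a (n_sites, n_voxels) array of
    covariate-adjusted site means.
    """
    sites = np.unique(site)
    # One indicator column per site, so each site gets its own intercept.
    site_cols = (site[:, None] == sites[None, :]).astype(float)
    # Assumption: centre covariates so that the site intercepts are
    # expressed at the pooled mean covariate value.
    cov = covariates - covariates.mean(axis=0)
    X = np.hstack([site_cols, cov])
    # lstsq solves all voxels at once (one column of `rish` per voxel).
    beta, *_ = np.linalg.lstsq(X, rish, rcond=None)
    return sites, beta[: len(sites)]

def scale_maps(site_means, ref_index=0, eps=1e-8):
    """Per-site scale maps relative to a reference site.

    Assumption: the scale is sqrt(reference / target), as in the
    conventional RISH formulation where features are signal energies.
    """
    ref = site_means[ref_index]
    return np.sqrt(ref / np.maximum(site_means, eps))
```

Because every site has its own intercept in a single design matrix, any number of sites can be fitted in one step, and the groups do not need to be matched on the covariates as long as the covariate effect is adequately captured by the model.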
Bioinformatics