ComBat models for harmonization of resting-state EEG features in multisite studies

Alberto Jaramillo-Jimenez, Diego A Tovar-Rios, Yorguin-Jose Mantilla-Ramos, John-Fredy Ochoa-Gomez, Laura Bonanni, Kolbjørn Brønnick
DOI: https://doi.org/10.1016/j.clinph.2024.09.019
2024-09-24
Abstract: Objective: Pooling multisite resting-state electroencephalography (rsEEG) datasets may introduce bias due to batch effects (i.e., cross-site differences in the rsEEG related to scanner/sample characteristics). The Combining Batches (ComBat) models, introduced for microarray expression and adapted for neuroimaging, can control for batch effects while preserving the variability of biological covariates. We aim to evaluate four ComBat harmonization methods in a pooled sample from five independent rsEEG datasets of young and old adults. Methods: RsEEG signals (n = 374) were automatically preprocessed. Oscillatory and aperiodic rsEEG features were extracted in sensor space. Features were harmonized using neuroCombat (standard ComBat used in neuroimaging), neuroHarmonize (variant with nonlinear adjustment of covariates), OPNested-GMM (variant based on Gaussian Mixture Models to fit bimodal feature distributions), and HarmonizR (variant based on resampling to handle missing feature values). Relationships between rsEEG features and age were explored before and after harmonizing batch effects. Results: Batch effects were identified in rsEEG features. All ComBat methods reduced batch effects and the dispersion of features; HarmonizR and OPNested-GMM ComBat performed best. Harmonized Beta power, individual Alpha peak frequency, Aperiodic exponent, and offset in posterior electrodes showed significant relations with age. All ComBat models maintained the direction of observed relationships while increasing the effect size. Conclusions: ComBat models, particularly HarmonizR and OPNested-GMM ComBat, effectively control for batch effects in rsEEG spectral features. Significance: This workflow can be used in multisite studies to harmonize batch effects in sensor-space rsEEG spectral features while preserving biological associations.
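To make the idea of removing batch effects while preserving a biological covariate concrete, the following is a minimal sketch of a location-scale adjustment in NumPy. It is not the paper's pipeline and it is not ComBat proper: it omits the empirical Bayes shrinkage of the site parameters that ComBat uses, and it does not reproduce any of the four packages named in the abstract. The function name `location_scale_harmonize` and its arguments are hypothetical, chosen only for illustration.

```python
import numpy as np


def location_scale_harmonize(features, batch, covariates):
    """Simplified site harmonization (hypothetical helper, not ComBat itself).

    features   : (n_samples, n_features) array, e.g. rsEEG spectral features
    batch      : (n_samples,) array of site labels
    covariates : (n_samples,) or (n_samples, n_cov) array, e.g. age
    """
    features = np.asarray(features, dtype=float)
    batch = np.asarray(batch)
    covariates = np.asarray(covariates, dtype=float).reshape(len(batch), -1)

    # Design matrix: one indicator column per site plus the covariates.
    sites = np.unique(batch)
    onehot = (batch[:, None] == sites[None, :]).astype(float)
    design = np.hstack([onehot, covariates])

    # Ordinary least squares fit of every feature at once.
    coef, *_ = np.linalg.lstsq(design, features, rcond=None)
    site_coef = coef[: len(sites)]
    cov_coef = coef[len(sites):]

    # Grand mean: sample-size-weighted average of the site intercepts.
    grand_mean = onehot.mean(axis=0) @ site_coef
    cov_effect = covariates @ cov_coef

    # Residuals have zero mean within each site by construction.
    resid = features - design @ coef
    pooled_sd = np.sqrt((resid ** 2).mean(axis=0))

    harmonized = np.empty_like(features)
    for site in sites:
        mask = batch == site
        # Assumes every site has non-zero residual variance per feature.
        site_sd = resid[mask].std(axis=0)
        # Rescale to the pooled spread, then restore the grand mean and
        # the covariate effect so the age relationship is kept.
        harmonized[mask] = (
            resid[mask] / site_sd * pooled_sd + grand_mean + cov_effect[mask]
        )
    return harmonized
```

Because the covariate term is added back after the site-specific mean and variance are removed, an association such as feature versus age survives the adjustment. Full ComBat additionally pools information across features to stabilize the site estimates, which matters most for small sites.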