Brain age prediction in schizophrenia: does the choice of machine learning algorithm matter?

Won Hee Lee, Mathilde Antoniades, Hugo G Schnack, Rene S. Kahn, Sophia Frangou
DOI: https://doi.org/10.1101/2020.07.28.224931
2020-07-29
Abstract:
Background: Schizophrenia has been associated with lifelong deviations in the normative trajectories of brain structure. These deviations can be captured using the brain-predicted age difference (brainPAD), which is the difference between the biological age of an individual’s brain, as inferred from neuroimaging data, and their chronological age. Various machine learning algorithms are currently used for this purpose but their comparative performance has yet to be systematically evaluated.
Methods: Six linear regression algorithms, ordinary least squares (OLS) regression, ridge regression, least absolute shrinkage and selection operator (Lasso) regression, elastic-net regression, linear support vector regression (SVR), and relevance vector regression (RVR), were applied to brain structural data acquired on the same 3T scanner using identical sequences from patients with schizophrenia (n=90) and healthy individuals (n=200). The performance of each algorithm was quantified by the mean absolute error (MAE) and the correlation (R) between predicted brain-age and chronological age. The inter-algorithm similarity in predicted brain-age, brain regional regression weights and brainPAD were compared using correlation analyses and hierarchical clustering.
Results: In patients with schizophrenia, ridge regression, Lasso regression, elastic-net regression, and RVR performed very similarly and showed a high degree of correlation in predicted brain-age (R>0.94) and brain regional regression weights (R>0.66). By contrast, OLS regression, which was the only algorithm without a penalty term, performed markedly worse and showed a lower similarity with the other algorithms. The mean brainPAD was higher in patients than in healthy individuals but varied by algorithm from 3.8 to 5.2 years although all analyses were performed on the same dataset.
Conclusions: Linear machine learning algorithms, with the exception of OLS regression, have comparable performance for age prediction on the basis of a combination of cortical and subcortical structural measures. However, algorithm choice introduced variation in brainPAD estimation, and therefore represents an important source of inter-study variability.
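To make the quantities named in the abstract concrete, the following is a minimal sketch of how brainPAD, MAE, and the correlation R might be computed for two sets of brain-age predictions. It is not the authors' pipeline. The sign convention (predicted minus chronological age) is an assumption consistent with a higher brainPAD in patients, the function names are hypothetical, and the ages and the labels `algorithm_a` and `algorithm_b` are toy values rather than data from the study.

```python
from math import sqrt


def brain_pad(predicted_age, chronological_age):
    """Per-subject brainPAD, assumed here to be predicted minus chronological age."""
    return [p - c for p, c in zip(predicted_age, chronological_age)]


def mean_absolute_error(predicted_age, chronological_age):
    """Mean absolute difference between predicted and chronological age."""
    errors = [abs(p - c) for p, c in zip(predicted_age, chronological_age)]
    return sum(errors) / len(errors)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical toy values (years); NOT data from the study.
chronological_age = [20.0, 30.0, 40.0, 50.0]
predictions = {
    "algorithm_a": [24.0, 33.0, 45.0, 54.0],
    "algorithm_b": [22.0, 36.0, 43.0, 57.0],
}

# Per-algorithm summary: mean brainPAD, MAE, and R against chronological age.
summary = {}
for name, predicted in predictions.items():
    pad = brain_pad(predicted, chronological_age)
    summary[name] = {
        "mean_brainPAD": sum(pad) / len(pad),
        "MAE": mean_absolute_error(predicted, chronological_age),
        "R": pearson_r(predicted, chronological_age),
    }

# Inter-algorithm similarity in predicted brain-age, as a single correlation.
inter_algorithm_r = pearson_r(
    predictions["algorithm_a"], predictions["algorithm_b"]
)

for name, stats in summary.items():
    print(name, stats)
print("inter-algorithm R:", inter_algorithm_r)
```

In this toy example the two sets of predictions correlate strongly with each other and with chronological age, yet their mean brainPAD differs by half a year. This illustrates, on invented numbers, how the same subjects can yield different brainPAD estimates depending on the algorithm.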