Robust automated detection of microstructural white matter degeneration in Alzheimer's disease using machine learning classification of multicenter DTI data
Martin Dyrba, Michael Ewers, Martin Wegrzyn, Ingo Kilimann, Claudia Plant, Annahita Oswald, Thomas Meindl, Michela Pievani, Arun L W Bokde, Andreas Fellgiebel, Massimo Filippi, Harald Hampel, Stefan Klöppel, Karlheinz Hauenstein, Thomas Kirste, Stefan J Teipel, EDSD study group, Federica Agosta, Frederik Barkhof, Janusch Blautzik, Florian Fischer, Giovanni B Frisoni, Lutz Frolich, Lukrezia Hausner, Frank Hentschel, Michael Hull, Frank Jessen, Vanja Kljajevic, Stefan Kloppel, Laurence O'Dwyer, Petra J W Pouwels
DOI: https://doi.org/10.1371/journal.pone.0064925
IF: 3.7
2013-05-31
PLoS ONE
Abstract: Diffusion tensor imaging (DTI) based assessment of white matter fiber tract integrity can support the diagnosis of Alzheimer's disease (AD). The use of DTI as a biomarker, however, depends on its applicability in a multicenter setting, which must account for the effects of different MRI scanners. We applied multivariate machine learning (ML) to a large multicenter sample from the recently created framework of the European DTI Study on Dementia (EDSD). We hypothesized that ML approaches may mitigate the effects of multicenter acquisition. We included a sample of 137 patients with clinically probable AD (MMSE 20.6±5.3) and 143 healthy elderly controls, scanned on nine different scanners. For diagnostic classification we used the DTI indices fractional anisotropy (FA) and mean diffusivity (MD) and, for comparison, gray matter and white matter density maps from anatomical MRI. Data were classified using a Support Vector Machine (SVM) and a Naïve Bayes (NB) classifier. We used two cross-validation approaches: (i) test and training samples randomly drawn from the entire data set (pooled cross-validation), and (ii) data from each scanner in turn as the test set, with the data from the remaining scanners as the training set (scanner-specific cross-validation). In the pooled cross-validation, SVM achieved an accuracy of 80% for FA and 83% for MD. Accuracies for NB were significantly lower, ranging between 68% and 75%. Removing variance components arising from scanners using principal component analysis did not significantly change the classification results for either classifier. For the scanner-specific cross-validation, the classification accuracy was reduced for both SVM and NB. After mean correction, classification accuracy reached a level comparable to the results obtained from the pooled cross-validation.
Our findings support the notion that machine learning allows robust classification of DTI data sets arising from multiple scanners, even if a new data set comes from a scanner that was not part of the training sample.
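
The two cross-validation schemes and the mean correction step described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the study's pipeline: the data are synthetic, the function names (`pooled_folds`, `scanner_folds`, `mean_correct`, `centroid_accuracy`) are hypothetical, "mean correction" is taken to mean subtracting each scanner's feature-wise mean, and a nearest-centroid rule stands in for the SVM and NB classifiers.

```python
import random
from collections import defaultdict


def pooled_folds(n_samples, k=10, seed=0):
    """Pooled CV: shuffle all samples and cut them into k test folds,
    ignoring which scanner each sample came from."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = []
    for f in range(k):
        test = idx[f::k]
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        folds.append((train, test))
    return folds


def scanner_folds(scanners):
    """Scanner-specific CV: each scanner serves once as the test set,
    and all remaining scanners form the training set."""
    by_scanner = defaultdict(list)
    for i, s in enumerate(scanners):
        by_scanner[s].append(i)
    return [([i for i, t in enumerate(scanners) if t != s], test)
            for s, test in by_scanner.items()]


def mean_correct(X, scanners):
    """Assumed form of mean correction: subtract each scanner's
    feature-wise mean. No diagnostic labels are used."""
    by_scanner = defaultdict(list)
    for i, s in enumerate(scanners):
        by_scanner[s].append(i)
    out = [None] * len(X)
    for rows in by_scanner.values():
        n_feat = len(X[rows[0]])
        mean = [sum(X[i][j] for i in rows) / len(rows) for j in range(n_feat)]
        for i in rows:
            out[i] = [X[i][j] - mean[j] for j in range(n_feat)]
    return out


def centroid_accuracy(X, y, folds):
    """Stand-in classifier (nearest class centroid), not the study's SVM/NB."""
    correct = total = 0
    for train, test in folds:
        centroids = {}
        for label in set(y[i] for i in train):
            rows = [X[i] for i in train if y[i] == label]
            centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        for i in test:
            pred = min(centroids, key=lambda c: sum(
                (a - b) ** 2 for a, b in zip(X[i], centroids[c])))
            correct += pred == y[i]
            total += 1
    return correct / total


# Synthetic data: 3 scanners, each adding its own offset to both features.
rng = random.Random(1)
scanners, y, X = [], [], []
for s, offset in enumerate([0.0, 3.0, 6.0]):
    for n in range(20):
        label = n % 2  # 0 = control, 1 = patient (toy labels)
        scanners.append(s)
        y.append(label)
        X.append([offset + label + rng.gauss(0, 0.3) for _ in range(2)])

X_corrected = mean_correct(X, scanners)
raw_acc = centroid_accuracy(X, y, scanner_folds(scanners))
corrected_acc = centroid_accuracy(X_corrected, y, scanner_folds(scanners))
pooled_acc = centroid_accuracy(X, y, pooled_folds(len(X), k=10))
print("pooled CV, raw features:            ", pooled_acc)
print("scanner-specific CV, raw features:  ", raw_acc)
print("scanner-specific CV, mean-corrected:", corrected_acc)
```

In this toy setting the scanner offsets are larger than the group difference, so a model trained on the other scanners transfers poorly to the held-out scanner until the per-scanner mean is removed. Because the correction needs only the unlabeled feature mean of the new scanner, it can be applied to data from a scanner that was never part of the training sample. If the proportion of patients differs strongly between scanners, this simple form of correction would also remove part of the group difference.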