End-to-end Volumetric Segmentation of White Matter Hyperintensities using Deep Learning
Sadaf Farkhani, Naiara Demnitz, Carl-Johan Boraxbekk, Henrik Lundell, Hartwig Roman Siebner, Esben Thade Petersen, Kristoffer Hougaard Madsen
DOI: https://doi.org/10.1016/j.cmpb.2024.108008
IF: 6.1
2024-01-11
Computer Methods and Programs in Biomedicine
Abstract:
Background and Objectives: Reliable detection of white matter hyperintensities (WMH) is crucial for studying the impact of diffuse white-matter pathology on brain health and for monitoring changes in WMH load over time. However, manual annotation of high-dimensional 3D neuroimages is laborious and prone to biases and errors in the annotation procedure. In this study, we evaluate the performance of deep learning (DL) segmentation tools and propose a novel volumetric segmentation model that incorporates self-attention via a transformer-based architecture. We also evaluate diverse factors that influence WMH segmentation, to provide a comprehensive analysis of state-of-the-art algorithms in a broader context.
Methods: We trained state-of-the-art DL algorithms, including models with advanced attention mechanisms, on structural fluid-attenuated inversion recovery (FLAIR) image acquisitions. The anatomical MRI data used for model training were obtained from healthy individuals aged 62-70 years in the LIve active Successful Aging (LISA) project. Given the potential sparsity of lesion volume among healthy aging individuals, we explored the impact of incorporating a weighted loss function and ensemble models. To assess the generalizability of the studied DL models, we applied the trained algorithms to an independent subset of data sourced from the MICCAI WMH challenge (MWSC). Notably, this subset had vastly different acquisition parameters from the LISA dataset used for training.
Results: The DL approaches consistently exhibited good segmentation performance, reaching a level of agreement comparable to inter-rater agreement among experts. On the out-of-sample dataset, the ensemble models performed best.
Conclusions: DL methods generally surpassed conventional approaches in our study. While all DL methods performed comparably, incorporating attention mechanisms could prove advantageous in future applications with a wider availability of training data. As expected, our experiments indicate that ensemble-based models enable superior generalization in out-of-distribution settings. We believe that introducing DL methods into the WMH annotation workflow in healthy aging cohorts is promising, not only for reducing the annotation time required, but also for eventually improving accuracy and robustness by incorporating the automatic segmentations into the evaluation procedure.
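The abstract mentions a weighted loss function for sparse lesions and ensemble models for out-of-distribution data, but does not specify their form. The following is a minimal NumPy sketch of one common way such ideas are realized, not the authors' implementation: a class-weighted binary cross-entropy, an ensemble formed by averaging voxel-wise probability maps, and a Dice overlap score. The function names, the `pos_weight` value, the averaging rule, and the synthetic volume are all illustrative assumptions.

```python
import numpy as np


def weighted_bce(prob, target, pos_weight=10.0, eps=1e-7):
    """Binary cross-entropy with lesion voxels up-weighted by pos_weight.

    Assumption: the abstract only says "weighted loss function"; this
    particular weighting scheme is a hypothetical illustration.
    """
    prob = np.clip(prob, eps, 1.0 - eps)
    loss = -(pos_weight * target * np.log(prob)
             + (1.0 - target) * np.log(1.0 - prob))
    return float(loss.mean())


def ensemble_average(prob_maps):
    """Combine ensemble members by averaging their probability maps.

    Assumption: simple averaging; the paper may combine members differently.
    """
    return np.mean(np.stack(prob_maps, axis=0), axis=0)


def dice_score(pred_mask, true_mask, eps=1e-7):
    """Dice overlap between two binary masks of the same shape."""
    pred = np.asarray(pred_mask).astype(bool)
    true = np.asarray(true_mask).astype(bool)
    intersection = np.logical_and(pred, true).sum()
    return float((2.0 * intersection + eps) / (pred.sum() + true.sum() + eps))


if __name__ == "__main__":
    # Synthetic 3D volume with one small "lesion" (hypothetical data).
    rng = np.random.default_rng(0)
    truth = np.zeros((16, 16, 16))
    truth[6:9, 6:9, 6:9] = 1.0

    # Three noisy probability maps standing in for three trained models.
    members = [
        np.clip(0.1 + 0.8 * truth + rng.normal(0.0, 0.15, truth.shape), 0.0, 1.0)
        for _ in range(3)
    ]
    ensemble = ensemble_average(members)

    for i, member in enumerate(members):
        print(f"member {i} Dice:", dice_score(member > 0.5, truth))
    print("ensemble Dice:", dice_score(ensemble > 0.5, truth))
    print("weighted BCE of ensemble:", weighted_bce(ensemble, truth))
```

Up-weighting the positive class is one way to keep a sparse lesion class from being dominated by background voxels, and averaging across members tends to reduce the variance of individual predictions; whether either choice matches the paper's setup would need to be checked against the full text.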
engineering, biomedical; computer science, interdisciplinary applications; medical informatics, theory & methods