Class Balancing Diversity Multimodal Ensemble for Alzheimer's Disease Diagnosis and Early Detection

Arianna Francesconi, Lazzaro di Biase, Donato Cappetta, Fabio Rebecchi, Paolo Soda, Rosa Sicilia, Valerio Guarrasi
2024-10-14
Abstract: Alzheimer's disease (AD) poses significant global health challenges due to its increasing prevalence and associated societal costs. Early detection and diagnosis of AD are critical for delaying progression and improving patient outcomes. Traditional diagnostic methods and single-modality data often fall short in identifying early-stage AD and distinguishing it from Mild Cognitive Impairment (MCI). This study addresses these challenges by introducing a novel approach: multImodal enseMble via class BALancing diversity for iMbalancEd Data (IMBALMED). IMBALMED integrates multimodal data from the Alzheimer's Disease Neuroimaging Initiative database, including clinical assessments, neuroimaging phenotypes, biospecimen and subject characteristics data. It employs an ensemble of model classifiers, each trained with different class balancing techniques, to overcome class imbalance and enhance model accuracy. We evaluate IMBALMED on two diagnostic tasks (binary and ternary classification) and four binary early detection tasks (at 12, 24, 36, and 48 months), comparing its performance with state-of-the-art algorithms and an unbalanced dataset method. IMBALMED demonstrates superior diagnostic accuracy and predictive performance in both binary and ternary classification tasks, significantly improving early detection of MCI at the 48-month time point. The method shows improved classification performance and robustness, offering a promising solution for early detection and management of AD.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper aims to address two main challenges in the early detection and diagnosis of Alzheimer's disease (AD):

1. **Integration of multimodal data**: Traditional diagnostic methods usually rely on a single data source (unimodal data), such as neuropsychological tests or specific biomarkers. These methods are limited in identifying early-stage AD and in distinguishing it from mild cognitive impairment (MCI). The paper proposes a multimodal ensemble that integrates data from different modalities (clinical assessments, biospecimen data, subject characteristics, and neuroimaging phenotypes) to provide more comprehensive information and thereby improve the diagnostic accuracy of AD.
2. **Class imbalance problem**: In the early detection of AD, far fewer patients convert from MCI to AD than remain stable, which leaves the dataset class-imbalanced. This imbalance may bias machine-learning models towards the majority class (usually cognitively normal, CN) and weaken their ability to recognize the minority class (potential AD or MCI cases). The paper addresses this with multImodal enseMble via class BALancing diversity for iMbalancEd Data (IMBALMED), which trains multiple model classifiers, each with a different class-balancing technique.

Specifically, the IMBALMED method is implemented through the following steps:

- **Data balancing**: For each data modality, several balanced subsets are created, each with a different level of class representation. Varying the percentage of each class across subsets introduces diversity while keeping every class adequately represented.
- **Model training**: A classifier is trained on each balanced subset, yielding multiple "model experts".
- **Fusion stage**: At test time, the class-membership probabilities of an input sample are computed in two steps, unimodal fusion and multimodal fusion. First, the probability distributions produced by the experts within each modality are combined into one unimodal probability vector per modality. Then these unimodal probability vectors are averaged across modalities to obtain the output vector, which gives the probability that the sample belongs to each class.

According to the paper, IMBALMED improves classification performance and robustness and offers a promising solution for the early detection and management of AD.
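
The three steps (data balancing, model training, and two-level fusion) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: the task is binary, balancing is done by random undersampling of the majority class to a target minority fraction, the "model expert" is a toy nearest-centroid classifier, and unimodal fusion is a plain mean (the summary only says the distributions are "combined"). The names `balanced_subset`, `CentroidClassifier`, `train_experts`, and `fuse_predictions` are hypothetical.

```python
import numpy as np


def balanced_subset(X, y, minority_fraction, rng):
    """Keep every minority sample and randomly undersample the majority class
    so that the minority makes up roughly `minority_fraction` of the subset.
    (Assumption: binary labels and undersampling as the balancing technique.)"""
    classes, counts = np.unique(y, return_counts=True)
    minority = classes[np.argmin(counts)]
    min_idx = np.flatnonzero(y == minority)
    maj_idx = np.flatnonzero(y != minority)
    n_keep = int(round(len(min_idx) * (1 - minority_fraction) / minority_fraction))
    n_keep = min(n_keep, len(maj_idx))
    keep = rng.choice(maj_idx, size=n_keep, replace=False)
    idx = np.concatenate([min_idx, keep])
    return X[idx], y[idx]


class CentroidClassifier:
    """Toy stand-in for a model expert: softmax over negative distances
    to the class centroids. Any probabilistic classifier could be used."""

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.centroids_ = np.stack([X[y == c].mean(axis=0) for c in self.classes_])
        return self

    def predict_proba(self, X):
        dist = np.linalg.norm(X[:, None, :] - self.centroids_[None, :, :], axis=2)
        logits = -dist
        logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
        p = np.exp(logits)
        return p / p.sum(axis=1, keepdims=True)


def train_experts(X, y, minority_fractions, rng):
    """Train one expert per balanced subset of a single modality."""
    experts = []
    for frac in minority_fractions:
        Xs, ys = balanced_subset(X, y, frac, rng)
        experts.append(CentroidClassifier().fit(Xs, ys))
    return experts


def fuse_predictions(experts_by_modality, X_by_modality):
    """Unimodal fusion (mean over experts, an assumption), then multimodal
    fusion (mean over modalities, as described in the summary)."""
    unimodal = [
        np.mean([expert.predict_proba(X) for expert in experts], axis=0)
        for experts, X in zip(experts_by_modality, X_by_modality)
    ]
    return np.mean(unimodal, axis=0)


# Usage on synthetic data: two modalities, 80 majority and 20 minority samples.
rng = np.random.default_rng(0)
y = np.array([0] * 80 + [1] * 20)
X_clinical = rng.normal(size=(100, 4)) + y[:, None]
X_imaging = rng.normal(size=(100, 6)) + 0.5 * y[:, None]

fractions = [0.3, 0.4, 0.5]  # hypothetical levels of minority representation
experts = [
    train_experts(X_clinical, y, fractions, rng),
    train_experts(X_imaging, y, fractions, rng),
]
probs = fuse_predictions(experts, [X_clinical[:5], X_imaging[:5]])
print(probs)  # one row per sample, one column per class
```

Because both fusion levels are simple averages of valid probability vectors, each output row remains a probability distribution over the classes. A weighted combination at either level would be a natural variation, but nothing in the summary specifies one.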