Abstract: There is considerable interest in the early diagnosis of Alzheimer's disease (AD) using machine learning. The NIH-sponsored Alzheimer's Disease Connectome Project (ADCP), a multi-center MRI, PET, and behavioral study of brain connectivity in AD, has a specific aim of accurately staging AD throughout its progression on an individual basis. It uses state-of-the-art MRI techniques, which allow for building reliable machine learning models. In this ongoing project, we are training models on structural MRI brain features to distinguish healthy controls from a combined group of AD and mild cognitive impairment (MCI) patients. Data from 12 patients (age=70.8±6.6 years, 7 males, 4 AD patients) and 20 healthy controls (age=68.9±6.2 years, 11 males) enrolled in ADCP were analyzed. The two groups were matched in age (p=0.45) and gender ratio (p=0.85). All images were acquired with 3T GE 750 scanners. T1-weighted images were acquired using a magnetization-prepared gradient echo sequence (TR/TE=604ms/2.516ms, 0.8mm isotropic). Data were pre-processed using the FreeSurfer-based Human Connectome Project (HCP) processing pipelines. A total of 269 structural features were extracted, including cortical thicknesses, surface areas, and subcortical and global volumes. They were normalized by the individual intracranial volume and then standardized with the z-score transform. Three traditional binary classification models were trained in Matlab: support vector machine (SVM), linear discriminant analysis (LDA), and naïve Bayes (NB) classifiers. We applied a t-test-based filter selection method, in which only the features with the largest group mean differences in the training set enter the training. For performance estimation, we used leave-one-out cross-validation (LOOCV) and the area under the curve (AUC). The SVM model classified the two groups with 90.6% accuracy (sensitivity=83.3%, specificity=95.0%, AUC=0.78, 23 features). 
The NB model reached 84.38% accuracy (sensitivity=83.3%, specificity=85.0%, AUC=0.83, 10 features). Bilateral temporal pole volumes and right entorhinal volume were the most discriminating features. Traditional linear machine learning models were able to separate AD/MCI patients from healthy controls with accuracies from the mid-80s to 90%. This is promising, as non-linear deep learning methods are expected to outperform these traditional models once more data become available.

Building an automated model to classify Alzheimer's patients is expected to aid early diagnosis. A t-test-based filter selection method was used so that only the features with the largest group differences entered the training. SVM reached the highest LOOCV accuracy, 90.6%, using 23 features. Bilateral temporal pole volumes showed the largest group differences and helped the models separate the two groups. These volumes are noticeably reduced in MCI and AD patients compared with healthy controls.
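The evaluation scheme described in the abstract, with t-test-based filter selection performed on the training set only inside each leave-one-out fold, can be illustrated with a minimal sketch. This is not the authors' Matlab code. It is written in Python using only the standard library, it uses a hand-written Gaussian naive Bayes classifier as a stand-in for the NB model, and it runs on synthetic data. All function names, the number of synthetic features (30), the number of informative features (3), and the deliberately large group shift are assumptions made for illustration. Intracranial-volume normalization and z-scoring are omitted for brevity.

```python
import math
import random
import statistics


def welch_t(a, b):
    """Welch's t statistic for two samples (unequal variances allowed)."""
    va, vb = statistics.variance(a), statistics.variance(b)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va / len(a) + vb / len(b))


def select_features(X, y, k):
    """Rank features by |t| between the two classes and keep the top k indices."""
    scores = []
    for j in range(len(X[0])):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        scores.append((abs(welch_t(pos, neg)), j))
    return [j for _, j in sorted(scores, reverse=True)[:k]]


def fit_gaussian_nb(X, y, features):
    """Estimate a class prior and per-feature mean/variance for each class."""
    model = {}
    for label in (0, 1):
        rows = [row for row, lab in zip(X, y) if lab == label]
        stats = []
        for j in features:
            values = [r[j] for r in rows]
            # Small variance floor (an assumption) guards against division by zero.
            stats.append((statistics.mean(values), statistics.variance(values) + 1e-9))
        model[label] = (math.log(len(rows) / len(X)), stats)
    return model


def predict(model, row, features):
    """Return the class with the larger Gaussian naive Bayes log-posterior."""
    def log_posterior(label):
        log_prior, stats = model[label]
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * var) - (row[j] - mu) ** 2 / (2 * var)
            for j, (mu, var) in zip(features, stats)
        )
    return max(model, key=log_posterior)


def loocv_accuracy(X, y, k):
    """Leave-one-out CV; features are re-selected in every fold without the held-out subject."""
    correct = 0
    for i in range(len(X)):
        X_train, y_train = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        features = select_features(X_train, y_train, k)
        model = fit_gaussian_nb(X_train, y_train, features)
        correct += int(predict(model, X[i], features) == y[i])
    return correct / len(X)


def make_synthetic(n_patients=12, n_controls=20, n_features=30,
                   n_informative=3, shift=5.0, seed=0):
    """Synthetic stand-in data: patients have reduced values on the first few features."""
    rng = random.Random(seed)
    X, y = [], []
    for label, n in ((1, n_patients), (0, n_controls)):
        for _ in range(n):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
            if label == 1:
                for j in range(n_informative):
                    row[j] -= shift
            X.append(row)
            y.append(label)
    return X, y


X, y = make_synthetic()
accuracy = loocv_accuracy(X, y, k=3)
print(f"LOOCV accuracy on synthetic data: {accuracy:.3f}")
```

Selecting features inside the loop, rather than once on the full data set, keeps the held-out subject from influencing which features are chosen, which matters with a sample as small as 32 subjects and 269 candidate features.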

Identifying and evaluating clinical subtypes of Alzheimer’s disease in care electronic health records using unsupervised machine learning

Characterizing the clinical heterogeneity of early symptomatic Alzheimer's disease: a data-driven machine learning approach

Data-driven discovery of probable Alzheimer's disease and related dementia subphenotypes using electronic health records

Identification of Outcome-Oriented Progression Subtypes from Mild Cognitive Impairment to Alzheimer’s Disease Using Electronic Health Records

Identifying Dementia Subtypes with Electronic Health Records

Knowledge-guided Deep Temporal Clustering for Alzheimer's Disease Subtypes in Completed Clinical Trials

Discovering Alzheimer’s Disease Subtypes with Imaging and Genetic Signatures Via Multi‐view Weakly‐supervised Deep Clustering

Learning the progression and clinical subtypes of Alzheimer's disease from longitudinal clinical data

Examining heterogeneity in dementia using data-driven unsupervised clustering of cognitive profiles

Identifying frailty‐related clusters of Alzheimer’s and Vascular dementia: A multi‐modal data approach

Clustering Alzheimer's Disease Subtypes via Similarity Learning and Graph Diffusion

Temporal Subtyping of Alzheimer's Disease Using Medical Conditions Preceding Alzheimer's Disease Onset in Electronic Health Records

Leveraging multi-site electronic health data for characterization of subtypes: a pilot study of dementia in the N3C Clinical Tenant

Improving clinical efficiency in screening for cognitive impairment due to Alzheimer's

The Application of Unsupervised Clustering Methods to Alzheimer’s Disease

Interpretable Deep Clustering Survival Machines for Alzheimer’s Disease Subtype Discovery

CHARACTERIZING STRUCTURAL BRAIN ALTERATIONS IN ALZHEIMER’S DISEASE PATIENTS WITH MACHINE LEARNING

Multimodal subtypes identified in Alzheimer's Disease Neuroimaging Initiative participants by missing-data-enabled subtype and stage inference

Pathologic subtyping of Alzheimer's disease brain tissue reveals disease heterogeneity

Exploring the Genetic Heterogeneity of Alzheimer's Disease: Evidence for Genetic Subtypes

Resolving heterogeneity in Alzheimer's disease based on individualized structural covariance network