Abstract: Dementia is characterized by a decline in memory and thinking that is significant enough to impair function in activities of daily living. Patients seen in dementia specialty clinics are highly heterogeneous, with a variety of symptoms that progress at different rates. In this work, we applied unsupervised, data-driven K-Means clustering to the component scores of the Clinical Dementia Rating (CDR) to identify dementia subtypes, and used the gap statistic to select the optimal number of clusters. Our goal was to characterize the identified dementia subtypes in terms of their cognitive performance and to analyze how patient transitions between subtypes relate to disease progression. Our results indicate both (i) inter-subtype variability, meaning that subtypes differ in a given component score even when they share the same CDR, and (ii) intra-subtype variability, meaning that the 6 component scores vary within a given dementia subtype. We observed that dementia subtypes representing individuals with very mild dementia (CDR 0.5) had widely varying rates of transition to other subtypes. Future work includes testing the generalizability of our proposed pipeline on additional datasets and using a larger volume of EHR data to derive probabilistic estimates of the variability between dementia subtypes, in terms of both cognitive profile and disease progression.
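The clustering step described in the abstract can be illustrated with a minimal, dependency-free sketch of K-Means combined with the gap statistic. This is not the authors' pipeline: the function names (`sq_dist`, `kmeans_wss`, `gap_statistic`), the three synthetic six-dimensional profiles, the noise level, and all parameter values are hypothetical and chosen only for illustration. The sketch assumes the common formulation of the gap statistic, in which reference data are drawn uniformly over the bounding box of the observed features and the smallest k with Gap(k) >= Gap(k+1) - s(k+1) is selected. Real CDR component scores are discrete, so a real analysis would also need to handle a within-cluster sum of squares of zero before taking logarithms.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_wss(points, k, rng, n_init=5, max_iter=50):
    """Run Lloyd's K-Means n_init times; return the lowest within-cluster sum of squares."""
    best = math.inf
    for _ in range(n_init):
        centroids = rng.sample(points, k)  # random initial centroids
        labels = None
        for _ in range(max_iter):
            # Assignment step: each point joins its nearest centroid.
            new_labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                          for p in points]
            if new_labels == labels:
                break  # assignments stopped changing
            labels = new_labels
            # Update step: move each centroid to the mean of its members.
            for j in range(k):
                members = [p for p, lab in zip(points, labels) if lab == j]
                if members:  # an empty cluster keeps its previous centroid
                    centroids[j] = tuple(sum(col) / len(members)
                                         for col in zip(*members))
        wss = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
        best = min(best, wss)
    return best


def gap_statistic(points, k_max, n_refs=10, seed=0):
    """Return (selected k, list of Gap(k) for k = 1..k_max)."""
    rng = random.Random(seed)
    lows = [min(col) for col in zip(*points)]
    highs = [max(col) for col in zip(*points)]
    # Reference datasets: uniform draws over the bounding box of the data.
    refs = [[tuple(rng.uniform(lo, hi) for lo, hi in zip(lows, highs))
             for _ in points] for _ in range(n_refs)]
    gaps, errors = [], []
    for k in range(1, k_max + 1):
        # Assumes a strictly positive sum of squares (true for this continuous data).
        log_w = math.log(kmeans_wss(points, k, rng))
        ref_logs = [math.log(kmeans_wss(ref, k, rng)) for ref in refs]
        mean_ref = sum(ref_logs) / n_refs
        sd = math.sqrt(sum((v - mean_ref) ** 2 for v in ref_logs) / n_refs)
        gaps.append(mean_ref - log_w)
        errors.append(sd * math.sqrt(1 + 1 / n_refs))
    # Smallest k with Gap(k) >= Gap(k+1) - s(k+1); fall back to k_max.
    best_k = k_max
    for k in range(1, k_max):
        if gaps[k - 1] >= gaps[k] - errors[k]:
            best_k = k
            break
    return best_k, gaps


# Hypothetical profiles standing in for the six CDR component scores.
profiles = [(0.5,) * 6, (2, 2, 1, 1, 0.5, 0.5), (1, 1, 2, 2, 2, 2)]
data_rng = random.Random(42)
patients = [tuple(data_rng.gauss(m, 0.1) for m in profile)
            for profile in profiles for _ in range(30)]

best_k, gaps = gap_statistic(patients, k_max=5)
print("selected k:", best_k)
print("gap values:", [round(g, 3) for g in gaps])
```

In practice, a library implementation such as scikit-learn's `KMeans` would likely replace the hand-written `kmeans_wss`; the plain-Python version is used here only to keep the sketch self-contained.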

Identifying Dementia Subtypes with Electronic Health Records