Stratification of Alzheimer's Disease Patients Using Knowledge-Guided Unsupervised Latent Factor Clustering with Electronic Health Record Data
Linshanshan Wang, Shruthi Venkatesh, Michele Morris, Mengyan Li, Ratnam Srivastava, Shyam Visweswaran, Oscar Lopez, Zongqi Xia, Tianxi Cai
DOI: https://doi.org/10.1101/2024.12.23.24319588
2024-12-26
Abstract: Background: People with Alzheimer's disease (AD) exhibit varying clinical trajectories. There is a need to predict future AD-related outcomes, such as morbidity and mortality, using the clinical profile available at the point of care.
Objective: To stratify AD patients based on baseline clinical profiles (up to two years prior to AD diagnosis) and update the model after AD diagnosis to prognosticate future AD-related outcomes.
Methods: Using the electronic health record (EHR) data of a large healthcare system (2011-2022), we first identified patients with ≥1 diagnosis code for AD or related dementia and applied a validated unsupervised phenotyping algorithm to assign AD diagnosis status. Next, we applied an unsupervised latent factor clustering approach, guided by knowledge graph embeddings of relevant EHR features up to the baseline, to cluster patients into two groups at AD diagnosis. We then prognosticated the risk of two readily ascertainable and clinically relevant AD-related outcomes (i.e., nursing home admission, which indicates a greater need for assistance, and mortality), adjusting for baseline confounders (e.g., age, gender, race, ethnicity, healthcare utilization, and comorbidities). For patients remaining at risk one year post-diagnosis, we updated their group membership and repeated the prognostication.
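The one-year update step can be read as a landmark-style restriction of the cohort. The following is a minimal sketch under that assumption, not the authors' implementation: the record fields (`id`, `time_days`, `event`), the function name `landmark_cohort`, and the toy records are all hypothetical, and the sketch handles a single outcome only.

```python
def landmark_cohort(patients, landmark_days=365):
    """Keep patients still event-free and under follow-up at the landmark,
    and restart their follow-up clock at the landmark.

    Assumes (hypothetically) that each record holds `time_days`, the time
    from AD diagnosis to the outcome or to censoring, whichever came first.
    """
    kept = []
    for p in patients:
        # A patient remains at risk only if follow-up extends past the landmark.
        if p["time_days"] > landmark_days:
            kept.append({**p, "time_days": p["time_days"] - landmark_days})
    return kept


# Hypothetical toy records, not data from the study.
toy = [
    {"id": "a", "time_days": 200, "event": 1},   # event before the landmark
    {"id": "b", "time_days": 900, "event": 0},   # censored after the landmark
    {"id": "c", "time_days": 365, "event": 1},   # event exactly at the landmark
    {"id": "d", "time_days": 1200, "event": 1},  # event after the landmark
]
at_risk = landmark_cohort(toy)
```

Under this reading, group membership would be re-estimated on the retained patients before the outcome models are refit.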
Results: We stratified 16,411 algorithm-identified AD patients into two groups based on their baseline clinical profiles (41% Group 1, 59% Group 2). Patients in Group 1 were marginally older at AD diagnosis (age Mean [SD]: 81.4 [9.3] vs 81.0 [8.7], p=.007), exhibited greater comorbidity burden (Elixhauser comorbidity index Mean [SD]: 11.3 [10.3] vs 7.5 [8.6], p<.0001), and more frequently received AD-related medications (47.7% vs 40.9%, p<.0001) than those in Group 2. Compared to Group 1, Group 2 had a lower risk of nursing home admission (HR [95% CI]=0.804 [0.765, 0.844], p<.001), while the two groups had similar mortality risk (HR [95% CI]=1.008 [0.963, 1.056], p=.733). One year after AD diagnosis, 12,606 patients remained at risk (45.7% Group 1, 54.3% Group 2). Consistent with baseline findings, in the updated model Group 2 had a lower risk of nursing home admission than Group 1 (HR [95% CI]=0.815 [0.766, 0.868], p<.001) and a similar mortality risk (HR [95% CI]=0.977 [0.922, 1.035], p=.430).
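As a reading aid, the reported hazard ratios, confidence intervals, and p-values can be checked against one another. The sketch below assumes the intervals are Wald intervals that are symmetric on the log-hazard scale; the abstract does not state how they were computed, and the function name `wald_from_hr` is illustrative.

```python
import math


def wald_from_hr(hr, ci_low, ci_high, z_crit=1.959964):
    """Recover the log-HR standard error, Wald z statistic, and two-sided
    p-value from a hazard ratio and its 95% confidence interval.

    Assumes the interval is exp(log(HR) +/- z_crit * SE).
    """
    log_hr = math.log(hr)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_hr / se
    # Two-sided p-value under a standard normal reference distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


# Baseline estimates for Group 2 vs Group 1, as reported in the Results.
nursing_home = wald_from_hr(0.804, 0.765, 0.844)
mortality = wald_from_hr(1.008, 0.963, 1.056)
```

Under this assumption, the nursing home estimate gives a Wald z far below zero (p well under .001), and the mortality estimate gives a p-value close to the reported .733, with the small gap attributable to rounding of the published figures.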
Conclusions: It is feasible to stratify patients based on readily available clinical profiles before AD diagnosis and, crucially, to update the model one year after diagnosis to prognosticate future AD-related outcomes.