Assess the Documentation of Cognitive Tests and Biomarkers in Electronic Health Records via Natural Language Processing for Alzheimer's Disease and Related Dementias

Zhaoyi Chen, Hansi Zhang, Xi Yang, Songzi Wu, Xing He, Jie Xu, Jingchuan Guo, Mattia Prosperi, Fei Wang, Hua Xu, Yong Chen, Hui Hu, Steven T DeKosky, Matthew Farrer, Yi Guo, Yonghui Wu, Jiang Bian
DOI: https://doi.org/10.1016/j.ijmedinf.2022.104973
IF: 4.73
2022-12-23
International Journal of Medical Informatics
Abstract: Background: Cognitive tests and biomarkers are key information for assessing the severity and tracking the progression of Alzheimer's disease (AD) and AD-related dementias (AD/ADRD), yet both are often documented only in the clinical narratives of patients' electronic health records (EHRs). In this work, we aim to (1) assess the documentation of cognitive tests and biomarkers in EHRs that can be used as real-world endpoints, and (2) identify, extract, and harmonize the commonly used cognitive tests from clinical narratives into categorical AD/ADRD severity using natural language processing (NLP) methods. Methods: We developed a rule-based NLP pipeline to extract cognitive tests and biomarkers from clinical narratives in AD/ADRD patients' EHRs. We aggregated the extracted results to the patient level and harmonized the cognitive test scores into severity categories using cutoffs determined from both the relevant literature and the domain knowledge of AD/ADRD clinicians. Results: We identified an AD/ADRD cohort of 48,912 patients from the University of Florida (UF) Health system and identified 7 measurements (6 cognitive tests and 1 biomarker) that are frequently documented in our data. Our NLP pipeline achieved an overall F1-score of 0.9059 across the 7 measurements. Among the 6 cognitive tests, we were able to harmonize 4 cognitive test scores into severity categories, and we described the population characteristics of patients at each severity level. We also identified several factors related to the availability of this documentation in EHRs. Conclusion: This study demonstrates that our NLP pipeline can extract cognitive tests and biomarkers of AD/ADRD accurately for downstream studies. Although the documentation of cognitive tests and biomarkers in EHRs appears to be low, real-world data (RWD) remain an important resource for AD/ADRD research. Nevertheless, a standardized approach to documenting cognitive tests and biomarkers in EHRs is warranted.
health care sciences & services; computer science, information systems; medical informatics
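
The rule-based extraction and harmonization described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the regular expression, the function names, the choice of the Mini-Mental State Examination (MMSE) as the example test, the severity cutoffs, and the patient-level aggregation rule (lowest score) are hypothetical and are not taken from the paper, whose actual rules and cutoffs were set from the literature and clinician input.

```python
import re
from typing import List, Optional

# Hypothetical rule: a mention of the test name, a short non-numeric gap,
# then a one- or two-digit score, optionally written as "score/30".
MMSE_PATTERN = re.compile(
    r"\b(?:MMSE|mini[- ]mental state exam(?:ination)?)\b"
    r"[^0-9\n]{0,20}?"
    r"(\d{1,2})(?!\d)(?:\s*/\s*30)?",
    re.IGNORECASE,
)

# Illustrative cutoffs only (lower bound, label); NOT the paper's cutoffs.
ILLUSTRATIVE_MMSE_CUTOFFS = [
    (24, "normal"),
    (18, "mild"),
    (10, "moderate"),
    (0, "severe"),
]


def extract_mmse_scores(note: str) -> List[int]:
    """Return every plausible MMSE score (0 to 30) mentioned in one note."""
    scores = []
    for match in MMSE_PATTERN.finditer(note):
        score = int(match.group(1))
        if 0 <= score <= 30:  # discard values outside the valid MMSE range
            scores.append(score)
    return scores


def harmonize_mmse(score: int) -> str:
    """Map a raw score to a severity category using the illustrative cutoffs."""
    for lower_bound, label in ILLUSTRATIVE_MMSE_CUTOFFS:
        if score >= lower_bound:
            return label
    raise ValueError(f"score out of range: {score}")


def patient_severity(notes: List[str]) -> Optional[str]:
    """Aggregate to the patient level.

    Assumption: the lowest documented score is used; the paper's
    aggregation rule is not specified in the abstract.
    """
    all_scores = [s for note in notes for s in extract_mmse_scores(note)]
    if not all_scores:
        return None  # no documented MMSE score for this patient
    return harmonize_mmse(min(all_scores))
```

For example, `patient_severity(["MMSE score of 22/30 today."])` returns `"mild"` under these placeholder cutoffs, and a patient whose notes contain no MMSE mention yields `None`, which is how missing documentation would surface in a sketch like this.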