Development and Evaluation of a Natural Language Processing Annotation Tool to Facilitate Phenotyping of Cognitive Status in Electronic Health Records: Diagnostic Study
Ayush Noori, Colin Magdamo, Xiao Liu, Tanish Tyagi, Zhaozhi Li, Akhil Kondepudi, Haitham Alabsi, Emily Rudmann, Douglas Wilcox, Laura Brenner, Gregory K Robbins, Lidia Moura, Sahar Zafar, Nicole M Benson, John Hsu, John R Dickson, Alberto Serrano-Pozo, Bradley T Hyman, Deborah Blacker, M Brandon Westover, Shibani S Mukerji, Sudeshna Das
DOI: https://doi.org/10.2196/40384
IF: 7.076
2022-08-31
Journal of Medical Internet Research
Abstract: Background: Electronic health records (EHRs) with large sample sizes and rich information offer great potential for dementia research, but current methods of phenotyping cognitive status are not scalable. Objective: The aim of this study was to evaluate whether natural language processing (NLP)–powered semiautomated annotation can improve the speed and interrater reliability of chart reviews for phenotyping cognitive status. Methods: In this diagnostic study, we developed and evaluated a semiautomated NLP-powered annotation tool (NAT) to facilitate phenotyping of cognitive status. Clinical experts adjudicated the cognitive status of 627 patients at Mass General Brigham (MGB) health care, using NAT or traditional chart reviews. Patient charts contained EHR data from two data sets: (1) records from January 1, 2017, to December 31, 2018, for 100 Medicare beneficiaries from the MGB Accountable Care Organization and (2) records from 2 years prior to COVID-19 diagnosis to the date of COVID-19 diagnosis for 527 MGB patients. All EHR data from the relevant period were extracted; diagnosis codes, medications, and laboratory test values were processed and summarized; clinical notes were processed through an NLP pipeline; and a web tool was developed to present an integrated view of all data. Cognitive status was rated as cognitively normal, cognitively impaired, or undetermined. Assessment time and interrater agreement for cognitive status phenotyping were compared between NAT and manual chart reviews. Results: NAT adjudication provided higher interrater agreement (Cohen κ=0.89 vs κ=0.80) and a significant speedup (time difference mean 1.4, SD 1.3 minutes; P
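The interrater agreement statistic reported in the abstract, Cohen κ, corrects the observed agreement between two raters for the agreement expected by chance. The following is a minimal sketch of that calculation for the three rating categories named above. The function name `cohen_kappa` and the example ratings are hypothetical illustrations, not the study's code or data.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen kappa for two raters labeling the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed
    proportion of agreement and p_e is the agreement expected by
    chance from each rater's marginal label frequencies.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")

    n = len(ratings_a)

    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement: sum over labels of the product of the two
    # raters' marginal proportions for that label.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical label for every item;
        # kappa is undefined, so report full agreement by convention.
        return 1.0

    return (observed - expected) / (1 - expected)


# Hypothetical ratings for six charts (illustrative only).
rater_1 = ["normal", "normal", "impaired", "impaired", "undetermined", "normal"]
rater_2 = ["normal", "normal", "impaired", "normal", "undetermined", "normal"]

kappa = cohen_kappa(rater_1, rater_2)
print(f"Cohen kappa = {kappa:.3f}")
```

In this made-up example the raters agree on 5 of 6 charts (observed agreement 0.833), chance agreement is 15/36 (0.417), and κ is therefore 5/7, about 0.714.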
health care sciences & services, medical informatics