EHR-based Case Identification of Pediatric Long COVID: A Report from the RECOVER EHR Cohort

Morgan Botdorf, Kimberly Dickinson, Vitaly Lorman, Hanieh Razzaghi, Nicole Marchesani, Suchitra Rao, Colin Rogerson, Miranda J Higginbotham, Asuncion Mejias, Daria Salyakina, Deepika Thacker, Dima Dandachi, Dimitri Christakis, Emily Taylor, Hayden Schwenk, Hiroki Morizono, Jonathan Cogen, Nathan M Pajor, Ravi Jhaveri, Chris Forrest, Charles Bailey, RECOVER Consortium
DOI: https://doi.org/10.1101/2024.05.23.24307492
2024-08-26
Abstract:
Objective: Long COVID, marked by persistent, recurring, or new symptoms after COVID-19 infection, impacts the well-being of children yet lacks a unified clinical definition. This study evaluates the performance of an empirically derived Long COVID case identification algorithm, or computable phenotype, against manual chart review in a pediatric sample. This approach aims to facilitate large-scale research efforts to better understand the condition.
Methods: The algorithm, composed of diagnostic codes empirically associated with Long COVID, was applied to a cohort of pediatric patients with SARS-CoV-2 infection in the RECOVER PCORnet EHR database. The algorithm classified 31,781 patients as having conclusive, probable, or possible Long COVID and 307,686 patients as having no evidence of Long COVID. A chart review was performed on a subset of patients (n = 651) to determine the overlap between the two methods. Instances of discordance were reviewed to understand the reasons for the differences.
Results: The sample comprised 651 pediatric patients (339 females, mean age = 10.10 years) across 16 hospital systems. There was moderate overlap between phenotype and chart review Long COVID identification (accuracy = 0.62, PPV = 0.49, NPV = 0.75), but also numerous cases of disagreement. No notable differences were found when the analyses were stratified by age at infection or era of infection. Further examination of the discordant cases revealed that the most common cause of disagreement was the clinician reviewers' tendency to attribute Long COVID-like symptoms to prior medical conditions. The performance of the phenotype improved when prior medical conditions were considered (accuracy = 0.71, PPV = 0.65, NPV = 0.74).
Conclusions: Although there was moderate overlap between the two methods, the discrepancies between them are likely attributable to the lack of consensus on a Long COVID clinical definition. It is essential to consider the strengths and limitations of each method when developing Long COVID classification algorithms.
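For readers less familiar with the agreement measures reported in the abstract, the following is a minimal sketch of how accuracy, PPV, and NPV follow from a 2x2 table that treats chart review as the reference standard. The function name `agreement_metrics` and the counts passed to it are hypothetical illustrations; they are not the study's data or the authors' code.

```python
def agreement_metrics(tp, fp, fn, tn):
    """Compute accuracy, PPV, and NPV from a 2x2 table.

    Assumed layout (hypothetical, chart review as the reference):
      tp: phenotype positive, chart review positive
      fp: phenotype positive, chart review negative
      fn: phenotype negative, chart review positive
      tn: phenotype negative, chart review negative
    """
    total = tp + fp + fn + tn
    if total == 0 or (tp + fp) == 0 or (tn + fn) == 0:
        raise ValueError("each metric needs a non-zero denominator")
    return {
        # Share of all reviewed patients on whom the two methods agree
        "accuracy": (tp + tn) / total,
        # Share of phenotype-positive patients confirmed by chart review
        "ppv": tp / (tp + fp),
        # Share of phenotype-negative patients confirmed by chart review
        "npv": tn / (tn + fn),
    }


# Illustrative counts only; these are NOT taken from the study.
metrics = agreement_metrics(tp=40, fp=10, fn=20, tn=30)
print(metrics)
```

Because PPV and NPV depend on how many phenotype-positive and phenotype-negative patients are sampled for review, values computed this way describe the reviewed subset and may not carry over unchanged to the full cohort.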
Pediatrics