Abstract:

Background: Post-COVID-19 condition (colloquially known as "long COVID-19"), characterized as postacute sequelae of SARS-CoV-2, has no universal clinical case definition. Recent efforts have focused on understanding long COVID-19 symptoms, and electronic health record (EHR) data provide a unique resource for understanding this condition. The introduction of the International Classification of Diseases, Tenth Revision (ICD-10) code U09.9 ("Post COVID-19 condition, unspecified") to identify patients with long COVID-19 has provided a method of evaluating this condition in EHRs; however, the accuracy of this code is unclear.

Objective: This study aimed to characterize the utility and accuracy of the U09.9 code across 3 health care systems (the Veterans Health Administration, the Beth Israel Deaconess Medical Center, and the University of Pittsburgh Medical Center) against patients identified with long COVID-19 via a chart review that operationalized the World Health Organization (WHO) and Centers for Disease Control and Prevention (CDC) definitions.

Methods: Patients within these health care systems who were COVID-19 positive, with either a U07.1 ICD-10 code or a positive polymerase chain reaction test, were identified for chart review. Among this cohort, we sampled patients based on two approaches: (1) patients with a U09.9 code and (2) patients without a U09.9 code but with a new-onset long COVID-19-related ICD-10 code, which allowed us to assess the sensitivity of the U09.9 code. To operationalize the long COVID-19 definition based on health agency guidelines, symptoms were grouped into a "core" cluster of 11 commonly reported symptoms among patients with long COVID-19 and an extended cluster that captured all other symptoms by disease domain. Patients having ≥2 symptoms persisting for ≥60 days that were new onset after their COVID-19 infection, with ≥1 symptom in the core cluster, were labeled as having long COVID-19 per chart review. The code's performance was compared across the 3 health care systems and across different time periods of the pandemic.

Results: Overall, 900 patient charts were reviewed across the 3 health care systems. The prevalence of long COVID-19 among the cohort with the U09.9 ICD-10 code, based on the operationalized WHO definition, ranged from 23.2% to 62.4% across these health care systems. We also evaluated a less stringent version of the WHO definition and the CDC definition and observed an increase in the prevalence of long COVID-19 at all 3 health care systems.

Conclusions: This is one of the first studies to evaluate the U09.9 code against a clinical case definition for long COVID-19, as well as the first to apply this definition to EHR data using a chart review approach on a nationwide cohort across multiple health care systems. This chart review approach can be implemented at other health care systems to further evaluate the utility and performance of the U09.9 code in their EHR data.
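The chart review labeling rule described in the Methods can be illustrated with a minimal sketch. This is not the study's implementation: the `Symptom` record, its field names, and the helper functions are hypothetical, and the sketch assumes that the required core-cluster symptom must itself be new onset and persist for at least 60 days, which the abstract does not state explicitly. The symptom names used in the usage note are placeholders, since the 11 core symptoms are not listed here.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Thresholds taken from the Methods text.
MIN_SYMPTOMS = 2
MIN_DURATION_DAYS = 60


@dataclass(frozen=True)
class Symptom:
    """One symptom abstracted from a chart (hypothetical record layout)."""
    name: str
    duration_days: int   # how long the symptom persisted
    new_onset: bool      # first documented after the COVID-19 infection
    core: bool           # reviewer placed it in the "core" cluster


def has_long_covid(symptoms: Iterable[Symptom]) -> bool:
    """Apply the rule: >=2 new-onset symptoms lasting >=60 days, >=1 of them core.

    Assumption: the core symptom must be one of the qualifying symptoms.
    """
    qualifying: List[Symptom] = [
        s for s in symptoms
        if s.new_onset and s.duration_days >= MIN_DURATION_DAYS
    ]
    distinct_names = {s.name for s in qualifying}
    return len(distinct_names) >= MIN_SYMPTOMS and any(s.core for s in qualifying)


def share_confirmed(labels: Iterable[bool]) -> float:
    """Fraction of reviewed charts labeled as long COVID-19 (e.g. among U09.9-coded patients)."""
    labels = list(labels)
    if not labels:
        raise ValueError("no charts reviewed")
    return sum(labels) / len(labels)
```

For example, a chart with `Symptom("symptom_a", 90, True, True)` and `Symptom("symptom_b", 75, True, False)` would be labeled positive, while a chart with two qualifying symptoms that are both in the extended cluster would not. Applying `share_confirmed` to the labels of U09.9-coded charts at one site gives the kind of per-site proportion reported in the Results.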
