Data Mining Pipeline for COVID-19 Vaccine Safety Analysis Using a Large Electronic Health Record

Yan Huang, Xiaojin Li, Deepa Dongarwar, Hulin Wu, Guo-Qiang Zhang
2023-06-16
Abstract: We developed a novel data mining pipeline that automatically extracts potential COVID-19 vaccine-related adverse events from a large Electronic Health Record (EHR) dataset. We applied this pipeline to the Optum® de-identified COVID-19 EHR dataset, which contains COVID-19 vaccine records between December 11, 2020 and January 20, 2022. We compared post-vaccination diagnoses between the COVID-19 vaccine group and the influenza vaccine group among 553,682 individuals without COVID-19 infection. We extracted 1,414 ICD-10 diagnosis categories (the first three ICD-10 digits) recorded within 180 days after the first dose of the COVID-19 vaccine. We then ranked the diagnosis codes using the adverse event rates and adjusted odds ratios based on the self-controlled case series analysis. Using inverse probability of censoring weighting, we estimated the right-censored time-to-event records. Our results show that the COVID-19 vaccine has an adverse event rate similar to that of the influenza vaccine. We found 20 types of potential COVID-19 vaccine-related adverse events that may need further investigation.
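As a rough illustration of the extraction step described in the abstract, the sketch below truncates ICD-10 codes to their three-character category, keeps diagnoses recorded within 180 days after the first dose, and ranks categories by the fraction of vaccinated patients affected. It is a minimal sketch and not the authors' pipeline: the function names, the input layout, and the choice to exclude diagnoses dated on the day of vaccination are all assumptions made for illustration, and it omits the self-controlled case series analysis and the censoring weights.

```python
from collections import defaultdict
from datetime import date

WINDOW_DAYS = 180  # follow-up window stated in the abstract


def icd10_category(code: str) -> str:
    """Return the three-character ICD-10 category, e.g. 'I26.99' -> 'I26'."""
    return code.replace(".", "").strip().upper()[:3]


def category_rates(first_dose, diagnoses):
    """Fraction of vaccinated patients with each category inside the window.

    first_dose: dict mapping patient_id -> date of first dose (hypothetical layout)
    diagnoses:  iterable of (patient_id, icd10_code, diagnosis_date) tuples
    """
    patients_by_category = defaultdict(set)
    for patient_id, code, dx_date in diagnoses:
        dose_date = first_dose.get(patient_id)
        if dose_date is None:
            continue  # diagnosis belongs to a patient with no vaccine record
        days = (dx_date - dose_date).days
        # Assumption: day 0 (the vaccination day itself) is excluded.
        if 0 < days <= WINDOW_DAYS:
            patients_by_category[icd10_category(code)].add(patient_id)
    n_patients = len(first_dose)
    return {cat: len(ids) / n_patients for cat, ids in patients_by_category.items()}


def rank_categories(rates):
    """Sort categories by descending rate, breaking ties alphabetically."""
    return sorted(rates.items(), key=lambda item: (-item[1], item[0]))
```

Each patient is counted at most once per category, so a rate here is the share of vaccinated patients with at least one qualifying diagnosis, which is only one of several reasonable definitions of an adverse event rate.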