Causal modeling in large-scale data to improve identification of adults at risk for combined and common variable immunodeficiencies
Giorgos Papanastasiou,Marco Scutari,Raffi Tachdjian,Vivian Hernandez-Trujillo,Jason Raasch,Kaylyn Billmeyer,Nikolay V Vasilyev,Vladimir Ivanov
DOI: https://doi.org/10.1101/2024.08.08.24311672
2024-08-09
Abstract:Combined immunodeficiencies (CID) and common variable immunodeficiencies (CVID), prevalent yet substantially underdiagnosed primary immunodeficiency disorders, necessitate improved early detection strategies. Leveraging large-scale electronic health record (EHR) data from four nationwide US cohorts, we developed a novel causal Bayesian Network (BN) model to unravel the complex interplay of antecedent clinical phenotypes associated with CID/CVID. Consensus directed acyclic graphs (DAGs) were constructed, which demonstrated robust predictive performance (ROC AUC in unseen data within each cohort ranged from 0.77-0.61) and generalizability (ROC AUC across all unseen cohort evaluations ranged from 0.72-0.56) in identifying CID/CVID across diverse patient populations, created using different inclusion criteria. These consensus DAGs elucidate causal relationships between comorbidities preceding CID/CVID diagnosis, including autoimmune and blood disorders, lymphomas, organ damage or inflammation, respiratory conditions, genetic anomalies, recurrent infections, and allergies. Further evaluation through causal inference and by expert clinical immunologists substantiates the clinical relevance of the identified phenotypic trajectories within the consensus DAGs. These findings hold promise for translation into improved clinical practice, potentially leading to earlier identification and intervention for adults at risk of CID/CVID.
Allergy and Immunology