Abstract:<h3 class="u-h4 u-margin-m-top u-margin-xs-bottom">Introduction</h3><p>Drug safety research asks causal questions but relies on observational data. Confounding bias threatens the reliability of studies using such data. Successfully controlling for confounding requires knowledge of confounders, variables that affect both the exposure and the outcome of interest. Acquiring causal knowledge of dynamic biological systems is complex and challenging. Fortunately, computable knowledge mined from the literature may hold clues about confounders. In this paper, we tested the hypothesis that incorporating literature-derived confounders can improve causal inference from observational data.</p><h3 class="u-h4 u-margin-m-top u-margin-xs-bottom">Methods</h3><p>We introduce two methods (semantic vector-based and string-based confounder search) that query literature-derived information for confounder candidates to control for, using SemMedDB, a database of computable knowledge mined from the biomedical literature. These methods search SemMedDB for confounders by applying a semantic constraint search for indications that are treated by the drug (exposure) and are also known to cause the adverse event (outcome). We then include the literature-derived confounder candidates in statistical and causal models derived from free-text clinical notes. For evaluation, we use a reference dataset, widely used in drug safety research, that contains labeled pairwise relationships between drugs and adverse events, and we attempt to rediscover these relationships from a corpus of 2.2M NLP-processed free-text clinical notes. We employ standard adjustment and causal inference procedures to predict and estimate causal effects, informing the models with varying numbers of literature-derived confounders and instantiating the exposure, outcome, and confounder variables in the models with dichotomous EHR-derived data. 
Finally, we compare the results from applying these procedures with naive measures of association (<em>χ</em><sup>2</sup> and reporting odds ratio) and with each other.</p><h3 class="u-h4 u-margin-m-top u-margin-xs-bottom">Results and Conclusions</h3><p>We found semantic vector-based search to be superior to string-based search at reducing confounding bias. However, the effect of including more rather than fewer literature-derived confounders was inconclusive. We recommend using targeted learning estimation methods that can address treatment-confounder feedback, in which confounders also behave as intermediate variables, and engaging subject-matter experts to adjudicate the handling of problematic confounders.</p>
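<p>The confounder search described in the Methods can be illustrated with a minimal sketch. It assumes the literature-derived knowledge is available as (subject, predicate, object) triples with <code>TREATS</code> and <code>CAUSES</code> predicates. The function name, the triple layout, and the toy data are hypothetical and for illustration only; this is neither the SemMedDB schema nor the authors' implementation, and it covers only exact string matching, not the semantic vector-based variant.</p>

```python
from typing import Iterable, Set, Tuple

# Assumed layout: (subject, predicate, object). Hypothetical, not the SemMedDB schema.
Triple = Tuple[str, str, str]


def confounder_candidates(triples: Iterable[Triple], drug: str, outcome: str) -> Set[str]:
    """Return concepts that the drug TREATS and that also CAUSE the outcome."""
    triples = list(triples)  # allow two passes over any iterable
    # Indications: objects of (drug, TREATS, x)
    treated = {o for s, p, o in triples if s == drug and p == "TREATS"}
    # Causes of the adverse event: subjects of (x, CAUSES, outcome)
    causes = {s for s, p, o in triples if p == "CAUSES" and o == outcome}
    # A candidate confounder satisfies both constraints
    return treated & causes


# Toy triples with placeholder names (not real predications)
toy = [
    ("drug_a", "TREATS", "condition_x"),
    ("drug_a", "TREATS", "condition_y"),
    ("condition_x", "CAUSES", "event_z"),
    ("condition_w", "CAUSES", "event_z"),
]

print(sorted(confounder_candidates(toy, "drug_a", "event_z")))  # ['condition_x']
```

<p>In this toy example only <code>condition_x</code> is both treated by the drug and a cause of the event, so it is the single candidate returned. Candidates found this way would still need to be mapped to EHR-derived variables before being included in a model.</p>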

Meta-Analysis with Untrusted Data

Random-Effect Meta-Analysis with Robust Between-Study Variance

A Unified Method for Improved Inference in Random-effects Meta-analysis

Causally-interpretable meta-analysis: clearly-defined causal effects and two case studies

Systematically Missing Data in Causally Interpretable Meta-Analysis

Dynamically borrowing strength from another study through shrinkage estimation

Estimating individual treatment effect: generalization bounds and algorithms

Meta-analysis of clinical trials in the 2020s and beyond: a paradigm shift needed

Advanced Methods and Implementations for the Meta-Analyses of Animal Models: Current Practices and Future Recommendations

A meta-analytic framework to adjust for bias in external control studies

Exact inference for the random‐effect model for meta‐analyses with rare events

Label-invariant models for the analysis of meta-epidemiological data

Robust inference methods for meta‐analysis involving influential outlying studies

Combining Cox Regressions Across a Heterogeneous Distributed Research Network Facing Small and Zero Counts

The Meta-Analysis: Supportive or Illuminating?

Robust inference for the unification of confidence intervals in meta-analysis

A re-analysis of about 60,000 sparse data meta-analyses suggests that using an adequate method for pooling matters

Precise unbiased estimation in randomized experiments using auxiliary observational data

Using computable knowledge mined from the literature to elucidate confounders for EHR-based pharmacovigilance

Meta-analysis of two studies in the presence of heterogeneity with applications in rare diseases

Collaborative Heterogeneous Causal Inference Beyond Meta-analysis