A Comparative Study of Design-Based and Analysis-Based Approaches to Causal Inference with Observational Data

Junni L. Zhang
DOI: https://doi.org/10.1080/24709360.2021.1992246
2021-01-01
Biostatistics & Epidemiology
Abstract: Causal inference with observational data is a central goal in many fields. Propensity score methods are design-based approaches that try to ensure covariate balance without using information from the outcome variables. Analysis-based approaches, such as Bayesian Additive Regression Trees and the Causal Forest, bypass the issue of covariate balance and model the outcomes directly. We use a Monte Carlo simulation to study the performance of these two types of approaches. Some of the simulation scenarios involve a large number of covariates relative to the number of observations. We find that the analysis-based approaches can perform very poorly without giving any warning of insufficient overlap between the covariate distributions of the treated and control groups. In contrast, the propensity score methods do warn of insufficient overlap, but such warnings can be overly cautious when there is in fact enough overlap.
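To make the notion of a design-based balance check concrete, the following is a minimal sketch of one common diagnostic, the standardized mean difference (SMD) between treated and control groups, computed from covariates alone and never from outcomes. It is not the procedure used in the paper; the function names, the toy data, and the 0.1 threshold (a widely used rule of thumb) are all assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, variance


def standardized_mean_difference(treated, control):
    """SMD for one covariate: difference in group means over the pooled SD."""
    pooled_sd = sqrt((variance(treated) + variance(control)) / 2)
    if pooled_sd == 0:
        # Both groups are constant: balanced only if the constants agree.
        return 0.0 if mean(treated) == mean(control) else float("inf")
    return (mean(treated) - mean(control)) / pooled_sd


def imbalanced_covariates(treated_rows, control_rows, names, threshold=0.1):
    """Return {covariate name: SMD} for covariates whose |SMD| exceeds threshold.

    The threshold is a conventional rule of thumb, assumed here for illustration.
    """
    flagged = {}
    for j, name in enumerate(names):
        smd = standardized_mean_difference(
            [row[j] for row in treated_rows],
            [row[j] for row in control_rows],
        )
        if abs(smd) > threshold:
            flagged[name] = smd
    return flagged


# Hypothetical toy data: columns are (age, dose); no outcome is used anywhere.
treated_rows = [(50, 1.0), (60, 2.0), (70, 3.0)]
control_rows = [(30, 1.0), (40, 2.0), (50, 3.0)]
flagged = imbalanced_covariates(treated_rows, control_rows, ["age", "dose"])
print(flagged)
```

In this toy data the age distributions of the two groups barely overlap, so age is flagged, while dose is identically distributed and is not. A check of this kind is what allows a design-based workflow to raise a warning before any outcome model is fitted.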