Bayesian causal inference for count potential outcomes

Young Lee, Wicher P. Bergsma, Marie-Abele C. Bind
DOI: https://doi.org/10.48550/arXiv.2008.03271
2020-08-08
Abstract: The literature for count modeling provides useful tools to conduct causal inference when outcomes take non-negative integer values. Applied to the potential outcomes framework, we link the Bayesian causal inference literature to statistical models for count data. We discuss the general architectural considerations for constructing the predictive posterior of the missing potential outcomes. Special considerations for estimating average treatment effects are discussed, some generalizing certain relationships and some not yet encountered in the causal inference literature.
Methodology
What problem does this paper attempt to address?
This paper addresses how to estimate the average treatment effect (ATE) within the Bayesian causal inference framework when the potential outcomes are counts, that is, non-negative integers. Specifically, it focuses on predicting the missing potential outcomes with appropriate statistical models and on evaluating the treatment effect from those predictions when the treatment assignment mechanism is known. The paper pays particular attention to overdispersion, where the variance exceeds the mean, which is common when units are heterogeneous or events depend on one another.

### Main Contributions

1. **Constructing a Bayesian Causal Inference Framework for Count Potential Outcomes**:
   - The paper proposes a Bayesian causal inference framework for count potential outcomes, which handles causal inference problems with non-negative integer outcomes.
   - Potential outcomes are modeled with the Poisson distribution and the lognormal-Poisson distribution; the latter accommodates overdispersed data.
2. **Proposing an Efficient Approximation Algorithm**:
   - To overcome the computational bottleneck of traditional Markov Chain Monte Carlo (MCMC) methods on large-scale data sets, the paper develops a new approximation algorithm that is several orders of magnitude faster than the exact Hamiltonian Monte Carlo (HMC) method.
   - Through theoretical analysis, the paper derives the convergence rate of the total variation distance (TVD) between the approximation and the true posterior distribution.
3. **Providing Specific Implementation Steps**:
   - The paper describes in detail how to estimate the average treatment effect through a four-step method:
     1. Evaluate the conditional distribution of the missing potential outcomes.
     2. Evaluate the conditional joint distribution of the parameters.
     3. Calculate the distribution of the missing potential outcomes by marginalizing over the parameters and hyper-parameters.
     4. Calculate the estimate of the average treatment effect.
4. **Case Studies**:
   - The paper demonstrates the model through two cases: the Poisson potential outcome model and the lognormal-Poisson potential outcome model.
   - The Poisson model yields the exact distributional form of the average treatment effect under specific conditions.
   - The lognormal-Poisson model shows how to estimate the average treatment effect by simulation in more complex situations.

### Conclusion

By constructing a Bayesian causal inference framework suited to count data, the paper provides a method for handling missing potential outcomes and addresses the computational challenges with an efficient approximation algorithm. The framework applies to Poisson-distributed data and also to overdispersed count data, which makes it useful for causal inference in practical applications.
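
The four-step procedure listed under Main Contributions can be illustrated with a minimal sketch for the simplest case, a Poisson potential outcome model. This is not the paper's implementation. It assumes a completely randomized assignment, potential outcomes that are independent given the rates, and a conjugate Gamma(`a`, `b`) prior on each arm's Poisson rate. The prior choice, the function name `impute_ate_draws`, and the simulated data are all hypothetical and serve only to make the steps concrete.

```python
import numpy as np

def impute_ate_draws(y_obs, w, a=1.0, b=1.0, n_draws=2000, seed=0):
    """Posterior draws of the finite-sample ATE under an assumed
    Gamma-Poisson model (illustrative sketch, not the paper's code)."""
    rng = np.random.default_rng(seed)
    y_obs = np.asarray(y_obs)
    w = np.asarray(w).astype(bool)
    n = y_obs.size
    n1, n0 = int(w.sum()), int((~w).sum())

    # Step 2: posterior of each arm's rate. With a Gamma(shape=a, rate=b)
    # prior and Poisson data, the posterior is Gamma(a + sum(y), b + n_arm).
    # NumPy parameterizes the gamma by scale = 1 / rate.
    lam1 = rng.gamma(a + y_obs[w].sum(), 1.0 / (b + n1), size=n_draws)
    lam0 = rng.gamma(a + y_obs[~w].sum(), 1.0 / (b + n0), size=n_draws)

    # Steps 1 and 3: given each rate draw, the missing outcomes are Poisson;
    # sampling them for every posterior draw marginalizes over the rates.
    y1_mis = rng.poisson(lam1[:, None], size=(n_draws, n0))  # Y(1) of controls
    y0_mis = rng.poisson(lam0[:, None], size=(n_draws, n1))  # Y(0) of treated

    # Step 4: combine observed and imputed outcomes into the ATE, per draw.
    sum_y1 = y_obs[w].sum() + y1_mis.sum(axis=1)
    sum_y0 = y_obs[~w].sum() + y0_mis.sum(axis=1)
    return (sum_y1 - sum_y0) / n

# Simulated example: 100 treated units (rate 5) and 100 controls (rate 3).
rng = np.random.default_rng(1)
w = rng.permutation(np.repeat([0, 1], 100))
y = np.where(w == 1, rng.poisson(5.0, 200), rng.poisson(3.0, 200))

draws = impute_ate_draws(y, w)
lo, hi = np.quantile(draws, [0.025, 0.975])
print(f"posterior mean ATE: {draws.mean():.2f}, 95% interval: ({lo:.2f}, {hi:.2f})")
```

Conjugacy keeps this sketch free of MCMC. For the lognormal-Poisson model the rate posterior has no closed form, so the draws of the parameters would have to come from a sampler or an approximation instead.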