Causal Inference When Counterfactuals Depend on the Proportion of All Subjects Exposed

Caleb H. Miles, Maya Petersen, Mark J. van der Laan
DOI: https://doi.org/10.48550/arXiv.1710.09588
2018-11-24
Abstract: The assumption that no subject's exposure affects another subject's outcome, known as the no-interference assumption, has long held a foundational position in the study of causal inference. However, this assumption may be violated in many settings, and in recent years has been relaxed considerably. Often this has been achieved with either the aid of a known underlying network, or the assumption that the population can be partitioned into separate groups, between which there is no interference, and within which each subject's outcome may be affected by all the other subjects in the group via the proportion exposed (the stratified interference assumption). In this paper, we instead consider a complete interference setting, in which each subject affects every other subject's outcome. In particular, we make the stratified interference assumption for a single group consisting of the entire sample. This can occur when the exposure is a shared resource whose efficacy is modified by the number of subjects among whom it is shared. We show that a targeted maximum likelihood estimator for the i.i.d. setting can be used to estimate a class of causal parameters that includes direct effects and overall effects under certain interventions. This estimator remains doubly-robust, semiparametric efficient, and continues to allow for incorporation of machine learning under our model. We conduct a simulation study, and present results from a data application where we study the effect of a nurse-based triage system on the outcomes of patients receiving HIV care in Kenyan health clinics.
Methodology
What problem does this paper attempt to address?
This paper addresses causal inference under interference, where one subject's exposure may affect another subject's outcome. Traditional causal inference methods usually assume no interference, but that assumption fails in many practical settings. For example, when the exposure is a shared resource, its efficacy may change with the number of subjects among whom it is shared. The paper focuses on the complete interference setting, in which each subject may affect every other subject's outcome. To make this tractable, it applies the stratified interference assumption to a single group consisting of the entire sample: each subject's outcome may depend on the exposures of all other subjects, but only through the proportion exposed. Under this assumption, the paper shows that a targeted maximum likelihood estimator (TMLE) developed for the i.i.d. setting can be used to estimate a class of causal parameters that includes direct effects and overall effects under certain interventions. The estimator remains doubly robust and semiparametric efficient in the complete interference setting, and it still allows machine learning methods to be incorporated in estimation. The paper evaluates the approach in a simulation study and in a data application studying the effect of a nurse-based triage system in Kenyan health clinics on the risk of death or loss to follow-up among patients receiving HIV care.
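To make the stratified interference assumption concrete, here is a minimal toy sketch in Python. It is not the paper's estimator or data-generating process: the outcome model, the coefficient 2.0, and the names `potential_outcome`, `proportion_exposed`, and `direct_effect` are all hypothetical, chosen only to show an outcome that depends on a subject's own exposure and on the proportion of the whole sample exposed.

```python
import random


def proportion_exposed(exposures):
    """Share of the whole sample that is exposed."""
    return sum(exposures) / len(exposures)


def potential_outcome(a_i, prop_exposed, w_i):
    """Hypothetical outcome model (an assumption, not from the paper):
    own exposure helps, but the benefit shrinks as a larger proportion
    of the sample shares the resource."""
    return w_i + 2.0 * a_i * (1.0 - prop_exposed)


def direct_effect(covariates, prop_exposed):
    """Average of Y_i(1, p) - Y_i(0, p) over subjects at a fixed proportion p."""
    diffs = [
        potential_outcome(1, prop_exposed, w) - potential_outcome(0, prop_exposed, w)
        for w in covariates
    ]
    return sum(diffs) / len(diffs)


rng = random.Random(0)
covariates = [rng.gauss(0.0, 1.0) for _ in range(1000)]

# Two exposure vectors with the same proportion exposed: under stratified
# interference, subject 0's outcome is the same regardless of *which*
# other subjects are exposed.
vec_a = [1, 1, 0, 0]
vec_b = [1, 0, 0, 1]
y_a = potential_outcome(vec_a[0], proportion_exposed(vec_a), covariates[0])
y_b = potential_outcome(vec_b[0], proportion_exposed(vec_b), covariates[0])
same_outcome = (y_a == y_b)

# The direct effect of own exposure varies with the proportion exposed.
effect_low = direct_effect(covariates, 0.25)
effect_high = direct_effect(covariates, 0.75)
print(same_outcome, round(effect_low, 3), round(effect_high, 3))
```

In this toy model the direct effect is larger when fewer subjects are exposed, which mirrors the shared-resource motivation in the abstract. Estimating such effects from observed data is a separate task, which the paper handles with TMLE.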