Communication‐Efficient Distributed Estimation of Causal Effects With High‐Dimensional Data

Xiaohan Wang, Jiayi Tong, Sida Peng, Yong Chen, Yang Ning
DOI: https://doi.org/10.1002/sta4.70006
2024-09-10
Stat
Abstract: We propose a communication‐efficient algorithm to estimate the average treatment effect (ATE) when the data are distributed across multiple sites and the number of covariates is possibly much larger than the sample size at each site. Our main idea is to calibrate the estimates of the propensity score and outcome models using suitable surrogate loss functions so that they approximately attain the desired covariate balancing property. We show that, under possible model misspecification, our distributed covariate balancing propensity score estimator (disthdCBPS) can approximate the global estimator, obtained by pooling the data from all sites, at a fast rate. Thus, our estimator remains consistent and asymptotically normal. In addition, when both the propensity score and the outcome models are correctly specified, the proposed estimator attains the semi‐parametric efficiency bound. We illustrate the empirical performance of the proposed method in both simulation and empirical studies.
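As a rough illustration of the setting in the abstract, the sketch below shows how an ATE estimate that combines a propensity score model with outcome models can be assembled from per-site summaries without sharing individual records. It is not the disthdCBPS algorithm: it uses the standard augmented inverse probability weighting (AIPW) score, assumes the nuisance estimates are already available at every site (here the true simulated models stand in for fitted ones), and omits the calibration and surrogate-loss steps entirely. All function names, the simulated data-generating process, and the true effect of 2.0 are assumptions made for the example.

```python
import math
import random


def site_summary(y, t, ps, mu1, mu0):
    """Summaries one site would send: (sum of AIPW scores, sum of squares, n)."""
    s = s2 = 0.0
    for yi, ti, pi, m1, m0 in zip(y, t, ps, mu1, mu0):
        # Standard AIPW (doubly robust) score for a single unit.
        psi = (m1 - m0) + ti * (yi - m1) / pi - (1 - ti) * (yi - m0) / (1 - pi)
        s += psi
        s2 += psi * psi
    return s, s2, len(y)


def combine(summaries):
    """Coordinator step: aggregate site summaries into an ATE and a standard error."""
    total = sum(s for s, _, _ in summaries)
    total_sq = sum(s2 for _, s2, _ in summaries)
    n = sum(k for _, _, k in summaries)
    ate = total / n
    var = max(total_sq / n - ate ** 2, 0.0)  # plug-in variance of the score
    return ate, math.sqrt(var / n)


def simulate_site(rng, n, effect=2.0):
    """Hypothetical data for one site; nuisance values are the true models."""
    y, t, ps, mu1, mu0 = [], [], [], [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        p = 1.0 / (1.0 + math.exp(-0.5 * x))  # true propensity score
        ti = 1 if rng.random() < p else 0
        m0, m1 = x, x + effect                # true outcome regressions
        yi = (m1 if ti else m0) + rng.gauss(0.0, 1.0)
        y.append(yi); t.append(ti); ps.append(p); mu1.append(m1); mu0.append(m0)
    return y, t, ps, mu1, mu0


rng = random.Random(0)
sites = [simulate_site(rng, 1000) for _ in range(3)]   # three sites
summaries = [site_summary(*site) for site in sites]    # computed locally
ate, se = combine(summaries)                           # one round of communication
print(f"ATE estimate: {ate:.3f} (SE {se:.3f})")
```

Each site transmits only three numbers, and the combined estimate matches what the same score would give on the pooled data, up to floating-point rounding. The harder problem the paper addresses, estimating the high-dimensional nuisance models themselves in a distributed way, is not covered by this sketch.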
statistics & probability