Hypothesis testing of one sample mean vector in distributed frameworks

Bin Du, Junlong Zhao, Xin Zhang
DOI: https://doi.org/10.1080/03610918.2024.2329992
2024-03-22
Communications in Statistics - Simulation and Computation
Abstract: Distributed frameworks are commonly used when data are stored on k different local machines and cannot be merged because of privacy protections or the sheer sample size. For a random vector X ∈ R^p with expectation μ, testing the mean vector H0: μ = μ0 versus H1: μ ≠ μ0 for a given vector μ0 is a basic problem in statistics. In distributed frameworks, computing the centralized test statistics is not privacy-preserving and often incurs heavy communication costs, which can be a burden when p or k is large. To deal with this problem, we extend two commonly used centralized test statistics to distributed versions based on the divide-and-conquer technique. It is observed that the proposed test statistics are effective and can reduce communication costs and computational complexity. Numerical results confirm the theoretical findings.
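To make the divide-and-conquer idea concrete, here is a minimal sketch in Python. It is not the statistic proposed by the authors, whose exact form the abstract does not give. As an assumption, it uses Hotelling's T^2 as the centralized statistic: each machine computes T^2 on its own block and transmits a single scalar, and the k scalars are summed. Under H0, with large local sample sizes and independent blocks, each local T^2 is approximately chi-square with p degrees of freedom, so the sum is approximately chi-square with k*p degrees of freedom. The function names `local_hotelling_t2` and `aggregate_t2` and the simulated data are hypothetical.

```python
import numpy as np

def local_hotelling_t2(X, mu0):
    """Hotelling's T^2 on one machine's block X (n_j rows, p columns)."""
    n, _ = X.shape
    diff = X.mean(axis=0) - mu0
    # Local p x p sample covariance; it never leaves the machine.
    S = np.atleast_2d(np.cov(X, rowvar=False))
    # n * diff' S^{-1} diff, via a linear solve rather than an explicit inverse.
    return float(n * diff @ np.linalg.solve(S, diff))

def aggregate_t2(local_stats, p):
    """Sum the k local scalars; return (statistic, degrees of freedom k*p)."""
    return float(sum(local_stats)), len(local_stats) * p

# Simulated example (assumption): k machines, n_j observations each, H0 true.
rng = np.random.default_rng(0)
k, n_j, p = 5, 200, 3
mu0 = np.zeros(p)
blocks = [rng.normal(size=(n_j, p)) for _ in range(k)]

stats = [local_hotelling_t2(X, mu0) for X in blocks]  # one scalar per machine
T, df = aggregate_t2(stats, p)
```

In this sketch each machine sends one number instead of its raw data or a p x p covariance matrix, which is the kind of communication saving the abstract refers to. A p-value could then be obtained from the chi-square reference distribution, for example with `scipy.stats.chi2.sf(T, df)` if SciPy is available.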
statistics & probability