Abstract: Frequentists' inference often delivers point estimators associated with confidence intervals or sets for parameters of interest. Constructing the confidence intervals or sets requires understanding the sampling distributions of the point estimators, which, in many but not all cases, are related to asymptotic Normal distributions ensured by central limit theorems. Although previous literature has established various forms of central limit theorems for statistical inference in super population models, we still need general and convenient forms of central limit theorems for some randomization-based causal analysis of experimental data, where the parameters of interest are functions of a finite population and randomness comes solely from the treatment assignment. We use central limit theorems for sample surveys and rank statistics to establish general forms of the finite population central limit theorems that are particularly useful for proving asymptotic distributions of randomization tests under the sharp null hypothesis of zero individual causal effects, and for obtaining the asymptotic repeated sampling distributions of the causal effect estimators. The new central limit theorems hold for general experimental designs with multiple treatment levels and multiple treatment factors, and are immediately applicable for studying the asymptotic properties of many methods in causal inference, including instrumental variable, regression adjustment, rerandomization, clustered randomized experiments, and so on. Previously, the asymptotic properties of these problems are often based on heuristic arguments, which in fact rely on general forms of finite population central limit theorems that have not been established before. Our new theorems fill in this gap by providing more solid theoretical foundation for asymptotic randomization-based causal inference.
What problem does this paper attempt to address?
The paper addresses how to establish general forms of Finite Population Central Limit Theorems (FPCLTs) for causal inference in randomized experiments, covering designs with multiple treatment levels and multiple treatment factors. It concerns the asymptotic distributions of estimators and test statistics when the parameters of interest are functions of a fixed finite population and the only source of randomness is the treatment assignment. Existing central limit theorems mostly target inference in super population models, so they do not apply directly to this randomization-based setting.
### Main Contributions:
1. **FPCLTs in General Form**: The paper establishes finite population central limit theorems that hold for experimental designs with multiple treatment levels and multiple treatment factors. They are particularly useful for proving the asymptotic distributions of randomization tests under the sharp null hypothesis of zero individual causal effects, and for obtaining the asymptotic repeated sampling distributions of causal effect estimators.
2. **Theoretical Basis**: The new FPCLTs fill a gap in the literature and give randomization-based causal inference a more solid theoretical foundation. The asymptotic properties of many existing methods (instrumental variables, regression adjustment, rerandomization, clustered randomized experiments, and so on) were often justified by heuristic arguments that implicitly rely on general finite population central limit theorems that had not been established.
3. **Wide Application**: The new FPCLTs apply directly to the asymptotic analysis of many causal inference methods, including instrumental variable estimation, randomization tests with multiple treatment levels, multiple randomization tests, rerandomization to ensure covariate balance, regression adjustment in completely randomized experiments, clustered randomized experiments, and unbalanced factorial experiments.
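To make "asymptotic repeated sampling distribution of a causal effect estimator" concrete, the following is a minimal simulation sketch, not the paper's method. It assumes a completely randomized experiment with two arms, holds a made-up set of potential outcomes fixed, and redraws only the treatment assignment. The standardization uses the classical Neyman variance formula for the difference in means, which is a standard result and is not quoted from the summary above. All function names and the toy population are hypothetical.

```python
import math
import random
import statistics


def neyman_variance(y1, y0, n1):
    """Randomization variance of the difference in means under complete
    randomization: S1^2/n1 + S0^2/n0 - S_tau^2/N (all with N - 1 divisors)."""
    N = len(y1)
    n0 = N - n1
    tau = [a - b for a, b in zip(y1, y0)]
    return (statistics.variance(y1) / n1
            + statistics.variance(y0) / n0
            - statistics.variance(tau) / N)


def simulate_estimates(y1, y0, n1, draws, rng):
    """Redraw the assignment `draws` times; the potential outcomes stay fixed."""
    N = len(y1)
    estimates = []
    for _ in range(draws):
        treated = set(rng.sample(range(N), n1))
        treated_mean = sum(y1[i] for i in treated) / n1
        control_mean = sum(y0[i] for i in range(N) if i not in treated) / (N - n1)
        estimates.append(treated_mean - control_mean)
    return estimates


rng = random.Random(1)
N, n1 = 500, 200
# Hypothetical fixed finite population: skewed control outcomes, noisy effects.
y0 = [rng.expovariate(1.0) for _ in range(N)]
y1 = [a + 0.5 + rng.gauss(0.0, 1.0) for a in y0]

tau_true = sum(y1) / N - sum(y0) / N
sd = math.sqrt(neyman_variance(y1, y0, n1))
estimates = simulate_estimates(y1, y0, n1, draws=3000, rng=rng)
z = [(e - tau_true) / sd for e in estimates]

# If the Normal approximation is good, about 95% of z lies within +/- 1.96.
coverage = sum(abs(zi) <= 1.96 for zi in z) / len(z)
print(f"sd of z: {statistics.pstdev(z):.3f}, coverage: {coverage:.3f}")
```

The population itself is never resampled here; the only random object is the set of treated units, which is the setting the FPCLTs are meant to cover.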
### Key Concepts:
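- **Finite Population Central Limit Theorems (FPCLTs)**: Theorems stating that, along a sequence of fixed finite populations, a suitably standardized statistic (such as the mean of a simple random sample drawn without replacement) converges in distribution to a Normal distribution as the sample and population sizes grow. The randomness comes from the sampling or treatment assignment, not from the population values.
- **Randomization Inference**: Testing and estimation based on the randomization distribution, that is, the distribution induced by the known random treatment assignment of the experimental design, with the finite population treated as fixed.
- **Sharp Null Hypothesis**: A hypothesis that fixes the individual causal effect of every unit, for example that all individual effects are zero (the case emphasized in the paper) or equal to known values.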
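As an illustration of how the sharp null and the randomization distribution fit together, here is a minimal sketch of a Monte Carlo randomization test. It assumes a completely randomized two-arm experiment and uses the absolute difference in means as the test statistic; both choices, along with the function names and toy data, are assumptions made for this example. Under the sharp null of zero individual effects, every unit's observed outcome would be the same under any assignment, so the outcomes can be held fixed while the assignment labels are reshuffled.

```python
import random


def diff_in_means(y, z):
    """Difference between treated (z == 1) and control (z == 0) mean outcomes."""
    treated = [yi for yi, zi in zip(y, z) if zi == 1]
    control = [yi for yi, zi in zip(y, z) if zi == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def randomization_p_value(y, z, draws=5000, seed=0):
    """Monte Carlo p-value under the sharp null of zero individual effects.

    Shuffling z keeps the number of treated units fixed, which mimics
    complete randomization (an assumption of this sketch).
    """
    rng = random.Random(seed)
    observed = abs(diff_in_means(y, z))
    z_perm = list(z)
    extreme = 0
    for _ in range(draws):
        rng.shuffle(z_perm)
        if abs(diff_in_means(y, z_perm)) >= observed:
            extreme += 1
    return extreme / draws


# Hypothetical data: the four treated units have clearly larger outcomes.
y_obs = [10, 11, 12, 13, 1, 2, 3, 4]
z_obs = [1, 1, 1, 1, 0, 0, 0, 0]
p_value = randomization_p_value(y_obs, z_obs)
print(f"randomization p-value: {p_value:.4f}")
```

For small experiments the randomization distribution can be enumerated exactly; for large ones, an FPCLT is what justifies replacing it with a Normal (or chi-square) approximation.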
### Mathematical Formulas:
- **Finite Population Variance** (for a finite population \(y_{N1},\ldots,y_{NN}\) of size \(N\) with mean \(\bar{y}_N\)):
\[
v_N=\frac{1}{N - 1}\sum_{i = 1}^N(y_{Ni}-\bar{y}_N)^2
\]
- **Maximum Squared Distance**:
\[
m_N=\max_{1\leq i\leq N}(y_{Ni}-\bar{y}_N)^2
\]
- **Condition (4)** (with \(n\) the size of the simple random sample drawn from the \(N\) units, as \(N\to\infty\)):
\[
\frac{1}{\min(n, N - n)}\cdot\frac{m_N}{v_N}\to0
\]
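The three quantities above can be computed directly, and the Normal approximation they support can be checked by simulation. The sketch below is illustrative only: the skewed population, the sample size, and all function names are made up for this example. It also uses the standard variance of the mean of a simple random sample drawn without replacement, \((1/n - 1/N)\,v_N\), which is a textbook fact and is not quoted from the summary above.

```python
import math
import random
import statistics


def finite_population_stats(y):
    """Return (v_N, m_N): the finite population variance and the maximum
    squared deviation from the population mean."""
    N = len(y)
    y_bar = sum(y) / N
    sq_dev = [(yi - y_bar) ** 2 for yi in y]
    return sum(sq_dev) / (N - 1), max(sq_dev)


def condition_ratio(y, n):
    """Left-hand side of Condition (4): m_N / (min(n, N - n) * v_N)."""
    v_N, m_N = finite_population_stats(y)
    return m_N / (min(n, len(y) - n) * v_N)


def standardized_sample_means(y, n, draws, rng):
    """Standardized means of simple random samples drawn without replacement."""
    N = len(y)
    y_bar = sum(y) / N
    v_N, _ = finite_population_stats(y)
    sd = math.sqrt((1 / n - 1 / N) * v_N)
    return [(sum(rng.sample(y, n)) / n - y_bar) / sd for _ in range(draws)]


rng = random.Random(0)
# Hypothetical fixed population with a skewed shape.
population = [rng.expovariate(1.0) for _ in range(2000)]

ratio = condition_ratio(population, n=400)
z = standardized_sample_means(population, n=400, draws=2000, rng=rng)
coverage = sum(abs(zi) <= 1.96 for zi in z) / len(z)

print(f"condition ratio: {ratio:.4f}")
print(f"mean of z: {statistics.mean(z):.3f}, sd of z: {statistics.pstdev(z):.3f}")
print(f"share of |z| <= 1.96: {coverage:.3f}")
```

A small ratio means no single unit dominates the population variance relative to the smaller of the sampled and unsampled groups. A single finite population cannot verify a limit statement, so the ratio is best read as a rough diagnostic rather than a check of the condition itself.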
### Conclusion:
By establishing finite population central limit theorems that cover multiple treatment levels and complex designs, the paper gives randomization-based causal inference a solid theoretical foundation and makes it possible to justify the asymptotic properties of many existing methods rigorously. The theorems apply to simple two-arm experiments as well as to more complex experimental designs and estimation methods.