Abstract: Classical randomized experiments, equipped with randomization-based inference, provide assumption-free inference for treatment effects. They have been the gold standard for drawing causal inference and provide excellent internal validity. However, they have also been criticized for questionable external validity, in the sense that their conclusions may not generalize well to a larger population. The randomized survey experiment is a design tool that can help mitigate this concern by randomly selecting the experimental units from the target population of interest. As pointed out by Morgan and Rubin (2012), however, chance imbalances often exist in covariate distributions between different treatment groups even under completely randomized experiments. Not surprisingly, such covariate imbalances also occur in randomized survey experiments. Furthermore, the covariate imbalances arise not only between different treatment groups, but also between the sampled experimental units and the overall population of interest. In this paper, we propose a two-stage rerandomization design that can actively avoid undesirable covariate imbalances at both the sampling and treatment assignment stages. We further develop asymptotic theory for rerandomized survey experiments, demonstrating that rerandomization provides better covariate balance, more precise treatment effect estimators, and shorter large-sample confidence intervals. We also propose covariate adjustment to deal with remaining covariate imbalances after rerandomization, showing that it can further improve both the sampling and estimated precision. Our work allows general relationships among covariates at the sampling, treatment assignment, and analysis stages, and generalizes both rerandomization in classical randomized experiments (Morgan and Rubin 2012) and rejective sampling in survey sampling (Fuller 2009).
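
To make the two-stage idea concrete, the following is a minimal sketch, not the paper's implementation. It assumes a Mahalanobis-distance acceptance rule at each stage, in the spirit of Morgan and Rubin (2012): a simple random sample is redrawn until its covariate means are close to the population means, and a complete randomization is then redrawn until the treated and control covariate means are close. The function names, the thresholds, and the use of the within-sample covariance at the assignment stage are all illustrative assumptions.

```python
import numpy as np


def mahalanobis_sq(diff, cov):
    """Squared Mahalanobis distance diff' cov^{-1} diff."""
    return float(diff @ np.linalg.solve(cov, diff))


def rejective_sample(W, n, threshold, rng, max_tries=10_000):
    """Stage 1 (assumed rule): redraw a simple random sample of size n
    until its covariate means are close to the population means."""
    N = W.shape[0]
    pop_mean = W.mean(axis=0)
    # Covariance of the sample mean under simple random sampling
    # without replacement from a finite population of size N.
    cov = (1.0 / n - 1.0 / N) * np.atleast_2d(np.cov(W, rowvar=False))
    for _ in range(max_tries):
        idx = rng.choice(N, size=n, replace=False)
        if mahalanobis_sq(W[idx].mean(axis=0) - pop_mean, cov) <= threshold:
            return idx
    raise RuntimeError("no acceptable sample found; threshold may be too strict")


def rerandomize_assignment(X, n1, threshold, rng, max_tries=10_000):
    """Stage 2 (assumed rule): redraw a complete randomization with n1
    treated units until treated and control covariate means are close."""
    n = X.shape[0]
    n0 = n - n1
    # Covariance of the difference in means under complete randomization,
    # using the within-sample covariance (an assumption of this sketch).
    cov = (n / (n1 * n0)) * np.atleast_2d(np.cov(X, rowvar=False))
    for _ in range(max_tries):
        z = np.zeros(n, dtype=int)
        z[rng.choice(n, size=n1, replace=False)] = 1
        diff = X[z == 1].mean(axis=0) - X[z == 0].mean(axis=0)
        if mahalanobis_sq(diff, cov) <= threshold:
            return z
    raise RuntimeError("no acceptable assignment found; threshold may be too strict")


# Hypothetical usage: a population of 500 units with 2 covariates.
rng = np.random.default_rng(0)
W = rng.normal(size=(500, 2))
sample_idx = rejective_sample(W, n=50, threshold=1.0, rng=rng)
z = rerandomize_assignment(W[sample_idx], n1=25, threshold=1.0, rng=rng)
```

The same covariates are used at both stages here only for brevity; the abstract notes that the covariates available at the sampling, assignment, and analysis stages may differ. Smaller thresholds enforce tighter balance at the cost of more redraws.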

Post-randomization for Controlling Identification Risk in Releasing Microdata from General Surveys

A Tutorial in Assessing Disclosure Risk in Microdata

Releasing survey microdata with exact cluster locations and additional privacy safeguards

Private Tabular Survey Data Products Through Synthetic Microdata Generation

Nonparametric model-assisted estimation from randomized response survey data

Rejective Sampling, Rerandomization and Regression Adjustment in Survey Experiments

The Role of Chance in the Census Bureau Database Reconstruction Experiment

Identifiability of Subgroup Causal Effects in Randomized Experiments with Nonignorable Missing Covariates

Randomization Resilient To Sensitive Reconstruction

Disclosure Risk Assessment in Perturbative Microdata Protection

Optimal disclosure risk assessment

Quantifying Privacy Risks of Public Statistics to Residents of Subsidized Housing

Asymptotic Theory of Rerandomization in Treatment-Control Experiments

Respondent privacy and estimation efficiency in randomized response surveys for discrete-valued sensitive variables

Statistical disclosure control for numeric microdata via sequential joint probability preserving data shuffling

A Novel Microdata Privacy Disclosure Risk Measure

Evaluating bias and noise induced by the U.S. Census Bureau's privacy protection methods

Randomization Tests for Adaptively Collected Data

Escalation of Commitment: A Case Study of the United States Census Bureau Efforts to Implement Differential Privacy for the 2020 Decennial Census

Ensuring anonymity in survey panel research

Assessing Statistical Disclosure Risk for Differentially Private, Hierarchical Count Data, with Application to the 2020 U.S. Decennial Census