Abstract: The precarious state of the educational system in U.S. inner cities, including its potential causes and solutions, has been a popular topic of debate in recent years. Part of the difficulty in resolving this debate is the lack of solid empirical evidence regarding the true impact of educational initiatives. For example, educational researchers are rarely able to conduct controlled, randomized experiments. The efficacy of so-called “school choice” programs has been a particularly contentious issue. A current multimillion-dollar evaluation of the New York School Choice Scholarship Program (NYSCSP) endeavors to shed some light on this issue. This study compares favorably with other school choice evaluations in terms of the consideration that went into the randomized experimental design (a completely new design, the Propensity Matched Pairs Design, is being implemented) and the rigorous data collection and compliance-encouraging efforts. In fact, this study benefits from the authors’ previous experience analyzing data from the Milwaukee Parental Choice Program, which, although randomized, was relatively poorly implemented as an experiment. At first glance, it would appear that the evaluation of the NYSCSP could proceed without undue statistical complexity. However, this program evaluation, as is common in studies with human subjects, suffers from unintended, although not unanticipated, complications. The first complication is noncompliance: approximately 25% of children who were awarded scholarships decided not to use them. The second complication is missing data: some parents failed to complete the survey in full, some children did not take pre-tests, and some children failed to show up for post-tests. Levels of missing data range from approximately 3% to 50% across variables.
Work by Frangakis and Rubin (1999) has revealed the severe threats to valid estimation of experimental effects that can exist in the presence of noncompliance and missing data, even for the estimation of simple intention-to-treat effects. To analyze longitudinal data from a randomized experiment suffering from missing data and noncompliance, we create multiple imputations for both missing outcomes and missing true compliance statuses using Bayesian models. Fitting Bayesian models to such data requires Markov chain Monte Carlo (MCMC) methods for missing data. Our Bayesian approach allows for analyses that rely on fewer assumptions than standard approaches. These analyses provide evidence of small positive effects of private school attendance on math test scores for certain subgroups of the children studied.
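To make the distinction between the intention-to-treat (ITT) effect and the effect of actually attending private school concrete, the sketch below computes both on a tiny data set. It is not the Bayesian multiple-imputation procedure described in the abstract. It is the simple moment-based (Wald, or instrumental-variable) estimator, and it assumes complete data, no private school attendance in the control group, and no effect of a scholarship offer on children who decline it. The function name `itt_and_cace` and all numbers are hypothetical, chosen so that 25% of scholarship winners do not use the award.

```python
from statistics import mean


def itt_and_cace(assigned, attended, outcome):
    """Return (ITT on outcome, ITT on attendance, Wald estimate of the complier effect).

    assigned: 1 if the child was offered a scholarship, else 0
    attended: 1 if the child attended private school, else 0
    outcome:  post-test score
    """
    y1 = [y for z, y in zip(assigned, outcome) if z == 1]
    y0 = [y for z, y in zip(assigned, outcome) if z == 0]
    d1 = [d for z, d in zip(assigned, attended) if z == 1]
    d0 = [d for z, d in zip(assigned, attended) if z == 0]

    itt_y = mean(y1) - mean(y0)  # effect of the offer on scores
    itt_d = mean(d1) - mean(d0)  # effect of the offer on attendance
    if itt_d == 0:
        raise ValueError("the offer did not change attendance; ratio is undefined")
    return itt_y, itt_d, itt_y / itt_d


# Hypothetical data: 4 children offered scholarships (3 use them), 4 controls.
assigned = [1, 1, 1, 1, 0, 0, 0, 0]
attended = [1, 1, 1, 0, 0, 0, 0, 0]
outcome = [24.0, 26.0, 28.0, 22.0, 20.0, 22.0, 24.0, 22.0]

itt_y, itt_d, cace = itt_and_cace(assigned, attended, outcome)
print(itt_y, itt_d, cace)
```

In this toy example the ITT effect is diluted by the children who decline the scholarship, so the estimated effect among compliers is larger than the ITT effect. With missing outcomes or missing compliance statuses, this simple ratio is no longer adequate, which is the setting the abstract addresses.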

A modified general location model for noncompliance with missing data: Revisiting the New York City School Choice Scholarship Program using principal stratification

Principal stratification approach to broken randomized experiments: A case study of School Choice vouchers in New York City - Rejoinder

Principal stratification approach to broken randomized experiments: A case study of school choice vouchers in New York City. Comments. Authors' reply

Principal Stratification Approach to Broken Randomized Experiments

School Choice in NY City: A Bayesian Analysis of an Imperfect Randomized Experiment

Multiple Imputation of Missing Categorical and Continuous Values via Bayesian Mixture Models with Local Dependence

Breaking Ties: Regression Discontinuity Design Meets Market Design

Fully Latent Principal Stratification With Measurement Models

Principal stratification analysis of noncompliance with time-to-event outcomes

Principal Stratification for Causal Inference with Extended Partial Compliance

Principal Stratification with Continuous Post-Treatment Variables: Nonparametric Identification and Semiparametric Estimation

Mixed‐Effect Hybrid Models for Longitudinal Data with Nonignorable Dropout

A direct likelihood approach to principal stratification analysis

Bayesian Nonparametrics for Principal Stratification with Continuous Post-Treatment Variables

Identification and estimation of causal effects in the presence of confounded principal strata

Leveraging Uncertainties to Infer Preferences: Robust Analysis of School Choice

Multivariate Shared-Parameter Mixed-Effects Location Scale Model for Analysis of Intensive Longitudinal Data

Partial identification of principal causal effects under violations of principal ignorability

GEEPERs: Principal Stratification using Principal Scores and Stacked Estimating Equations