School Choice in NY City: A Bayesian Analysis of an Imperfect Randomized Experiment

John Barnard, Constantine Frangakis, Jennifer Hill, Donald B. Rubin
DOI: https://doi.org/10.1007/978-1-4613-0035-9_1
2002-01-01
Abstract: The precarious state of the educational system in the inner cities of the U.S., including its potential causes and solutions, has been a popular topic of debate in recent years. Part of the difficulty in resolving this debate is the lack of solid empirical evidence regarding the true impact of educational initiatives. For example, educational researchers are rarely able to engage in controlled, randomized experiments. The efficacy of so-called “school choice” programs has been a particularly contentious issue. A current multi-million dollar evaluation of the New York School Choice Scholarship Program (NYSCSP) endeavors to shed some light on this issue. This study can be favorably contrasted with other school choice evaluations in terms of the consideration that went into the randomized experimental design (a completely new design, the Propensity Matched Pairs Design, is being implemented) and the rigorous data collection and compliance-encouraging efforts. In fact, this study benefits from the authors’ previous experiences with the analysis of data from the Milwaukee Parental Choice Program, which, although randomized, was relatively poorly implemented as an experiment. At first glance, it would appear that the evaluation of the NYSCSP could proceed without undue statistical complexity. However, this program evaluation, as is common in studies with human subjects, suffers from unintended, although not unanticipated, complications. The first complication is noncompliance: approximately 25% of children who were awarded scholarships decided not to use them. The second complication is missing data: some parents failed to complete the survey information fully; some children did not take pre-tests; some children failed to show up for post-tests. Levels of missing data range from approximately 3% to 50% across variables.
Work by Frangakis and Rubin (1999) has revealed the severe threats to valid estimates of experimental effects that can exist in the presence of noncompliance and missing data, even for estimation of simple intention-to-treat effects. To analyze longitudinal data from a randomized experiment suffering from missing data and noncompliance, we create multiple imputations for both missing outcomes and missing true compliance statuses using Bayesian models. Fitting Bayesian models to such data requires MCMC methods for missing data. Our Bayesian approach allows for analyses that rely on fewer assumptions than standard approaches. These analyses provide evidence of small positive effects of private school attendance on math test scores for certain subgroups of the children studied.
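As an illustration of how results from multiply imputed data sets are commonly combined, the following is a minimal sketch of the standard combining rules for multiple imputation (often called Rubin's rules). It is not taken from the paper: the function name `pool_rubin` and the example numbers are hypothetical, and the abstract does not say how the authors summarize their imputations.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Combine m completed-data analyses with the standard MI combining rules.

    estimates: point estimate from each imputed data set (hypothetical inputs)
    variances: squared standard error from each imputed data set
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")

    q_bar = mean(estimates)      # pooled point estimate
    within = mean(variances)     # average within-imputation variance
    between = variance(estimates)  # between-imputation variance (divisor m - 1)

    # Total variance inflates the between-imputation part by (1 + 1/m)
    total = within + (1 + 1 / m) * between
    return q_bar, total


# Illustrative numbers only; these are not results from the study.
pooled, total_var = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(pooled, total_var)
```

The between-imputation term reflects the extra uncertainty due to the missing values, which is why analyzing a single imputed data set would tend to understate the variance.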