Estimating causal effects from large data sets using propensity scores

Donald B Rubin
1997-10-15
Abstract: The aim of many analyses of large databases is to draw causal inferences about the effects of actions, treatments, or interventions. Examples include the effects of various options available to a physician for treating a particular patient, the relative efficacies of various health care providers, and the consequences of implementing a new national health care policy. A complication of using large databases to achieve such aims is that their data are almost always observational rather than experimental. That is, the data in most large data sets are not based on the results of carefully conducted randomized clinical trials, but rather represent data collected through the observation of systems as they operate in normal practice without any interventions implemented by randomized assignment rules. Such data are relatively inexpensive to obtain, however, and often do represent the spectrum of medical practice better than the …