Abstract: The design of any study, whether experimental or observational, that is intended to estimate the causal effects of a treatment condition relative to a control condition refers to those activities that precede any examination of outcome variables. As defined in our 1983 article (Rosenbaum & Rubin, 1983), the propensity score is the unit-level conditional probability of assignment to treatment versus control given the observed covariates; so the propensity score explicitly does not involve any outcome variables, in contrast to other summaries of variables sometimes used in observational studies. Balancing the distributions of covariates in the treatment and control groups by matching or balancing on the propensity score is therefore an aspect of the design of the observational study. In this invited comment on our 1983 article, we review the situation in the early 1980s and recall some apparent paradoxes that propensity scores helped to resolve. We demonstrate that it is possible to balance an enormous number of low-dimensional summaries of a high-dimensional covariate, even though it is generally impossible to match individuals closely for all the components of a high-dimensional covariate. In a sense, there is only one crucial observed covariate, the propensity score, and there is one crucial unobserved covariate, the principal unobserved covariate. The propensity score and the principal unobserved covariate are equal when treatment assignment is strongly ignorable, that is, unconfounded. Controlling for observed covariates is a prelude to the crucial step from association to causation, the step that addresses potential biases from unmeasured covariates. The design of an observational study also prepares for the step to causation: by selecting comparisons to increase the design sensitivity, by seeking opportunities to detect bias, by seeking mutually supportive evidence affected by different biases, by incorporating quasi-experimental devices such as multiple control groups, and by including the economist’s instruments. All of these considerations reflect the formal development of sensitivity analyses that were largely informal prior to the 1980s.
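
The balancing idea described in the abstract can be illustrated with a small simulation. The following is a minimal sketch and is not taken from the article: the logistic assignment model, its coefficients (0.8 and 0.5), the two-covariate setup, the use of five strata, and all function names are assumptions chosen for illustration. It uses the true propensity score, which is known here only because the data are simulated, and it involves no outcome variable, in keeping with the view that balancing on the propensity score is part of design.

```python
import math
import random


def propensity(x1, x2):
    """Propensity score e(x) = Pr(Z = 1 | x) under an assumed logistic model."""
    return 1.0 / (1.0 + math.exp(-(0.8 * x1 + 0.5 * x2)))


def simulate(n, seed=0):
    """Draw n units as tuples (x1, x2, e, z); no outcome is generated."""
    rng = random.Random(seed)
    units = []
    for _ in range(n):
        x1, x2 = rng.gauss(0, 1), rng.gauss(0, 1)
        e = propensity(x1, x2)
        z = 1 if rng.random() < e else 0  # treatment assigned with probability e
        units.append((x1, x2, e, z))
    return units


def mean(values):
    return sum(values) / len(values)


def raw_difference(units, k):
    """Treated-minus-control mean of covariate k, ignoring the propensity score."""
    treated = [u[k] for u in units if u[3] == 1]
    control = [u[k] for u in units if u[3] == 0]
    return mean(treated) - mean(control)


def stratified_difference(units, k, n_strata=5):
    """Treated-minus-control mean of covariate k, averaged over propensity strata."""
    ordered = sorted(units, key=lambda u: u[2])  # sort by propensity score
    size = len(ordered) // n_strata
    total, weight = 0.0, 0
    for s in range(n_strata):
        start = s * size
        end = (s + 1) * size if s < n_strata - 1 else len(ordered)
        stratum = ordered[start:end]
        treated = [u[k] for u in stratum if u[3] == 1]
        control = [u[k] for u in stratum if u[3] == 0]
        if not treated or not control:
            continue  # skip strata that lack one of the two groups
        total += len(stratum) * (mean(treated) - mean(control))
        weight += len(stratum)
    return total / weight


units = simulate(20000)
raw_x1 = raw_difference(units, 0)
adj_x1 = stratified_difference(units, 0)
raw_x2 = raw_difference(units, 1)
adj_x2 = stratified_difference(units, 1)
print("x1 imbalance, raw vs stratified:", raw_x1, adj_x1)
print("x2 imbalance, raw vs stratified:", raw_x2, adj_x2)
```

In this sketch, stratifying on a single scalar reduces the imbalance in both covariates at once, which is the sense in which the propensity score acts as the one crucial observed covariate. Coarse strata leave some residual imbalance, and in practice the score would have to be estimated rather than known.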

Estimating and Using Propensity Scores with Partially Missing Data

Estimating the Propensity Score

Propensity scores using missingness pattern information: a practical guide

Propensity Score Method: a Non-Parametric Technique to Reduce Model Dependence

The central role of the propensity score in observational studies for causal effects

Propensity Score Methods for Creating Covariate Balance in Observational Studies

Using Propensity Scores to Estimate Effects of Treatment Initiation Decisions: State of the Science

Matching Using Estimated Propensity Scores: Relating Theory to Practice

Covariate-adjusted Survival Analyses in Propensity-Score Matched Samples: Imputing Potential Time-to-event Outcomes

Balance Diagnostics after Propensity Score Matching

Propensity Score Weighting with Missing Data on Covariates and Clustered Data Structure

Combining Propensity Score Matching with Additional Adjustments for Prognostic Covariates

Robust propensity score weighting estimation under missing at random

Propensity Score Analysis With Baseline and Follow-Up Measurements of the Outcome Variable

Estimating Adjusted Risk Differences by Multiply‐imputing Missing Control Binary Potential Outcomes Following Propensity Score‐matching

Propensity Score Augmentation in Matching-based Estimation of Causal Effects

Propensity scores in the design of observational studies for causal effects

Propensity score analysis for time-dependent exposure

[Application of Propensity Score Matching in the Design of an Epidemiological Study]

Constructing a Control Group Using Multivariate Matched Sampling Methods That Incorporate the Propensity Score

Subgroup Balancing Propensity Score