Abstract: Matched sampling is a standard technique in the evaluation of treatments in observational studies. Matching on estimated propensity scores comprises an important class of procedures when there are numerous matching variables. Recent theoretical work (Rubin, D. B. and Thomas, N., 1992, The Annals of Statistics 20, 1079-1093) on affinely invariant matching methods with ellipsoidal distributions provides a general framework for evaluating the operating characteristics of such methods. Moreover, Rubin and Thomas (1992, Biometrika 79, 797-809) use this framework to derive several analytic approximations under normality for the distribution of the first two moments of the matching variables in samples obtained by matching on estimated linear propensity scores. Here we provide a bridge between these theoretical approximations and actual practice. First, we complete and refine the normal-based analytic approximations, thereby making it possible to apply these results to practice. Second, we perform Monte Carlo evaluations of the analytic results under normal and nonnormal ellipsoidal distributions, which confirm the accuracy of the analytic approximations, and demonstrate the predictable ways in which the approximations deviate from simulation results when normal assumptions are violated within the ellipsoidal family. Third, we apply the analytic approximations to real data with clearly nonellipsoidal distributions, and show that the theoretical expressions, although derived under artificial distributional conditions, produce useful guidance for practice. Our results delineate the wide range of settings in which matching on estimated linear propensity scores performs well, thereby providing useful information for the design of matching studies. When matching with a particular data set, our theoretical approximations provide benchmarks for expected performance under favorable conditions, thereby identifying matching variables requiring special treatment. After matching is complete and data analysis is at hand, our results provide the variances required to compute valid standard errors for common estimators.

Mistaken identities lead to missed opportunities: Testing for mean differences in partially matched data

A Comparative Review of Methods for Comparing Means Using Partially Paired Data

Matching and Regression to the Mean in Difference‐in‐Differences Analysis

Testing Biased Randomization Assumptions and Quantifying Imperfect Matching and Residual Confounding in Matched Observational Studies

The power of a paired t-test with a covariate

Propensity Score Method: a Non-Parametric Technique to Reduce Model Dependence

Weighted Mean Difference Statistics for Paired Data in Presence of Missing Values

Testing for a difference in means of a single feature after clustering

Matching as Nonparametric Preprocessing for Reducing Model Dependence in Parametric Causal Inference

Statistical Tests for Proportion Difference in One-to-Two Matched Binary Diagnostic Data: Application to Environmental Testing of Salmonella in the United States

Matching Using Estimated Propensity Scores: Relating Theory to Practice

Inference in Experiments with Matched Pairs and Imperfect Compliance

Intrinsic tests for the equality of two correlated proportions

Powerful Test of Heterogeneity in Two‐Sample Summary‐Data Mendelian Randomization

An Inverse Normal Transformation Solution for the comparison of two samples that contain both paired observations and independent observations

More Powerful Multiple Testing in Randomized Experiments with Non-Compliance

Estimating and Using Propensity Scores with Partially Missing Data

Comparison of paired ordinal data with mis-classification and covariates adjustment

On Testing the Mean Equivalence of Treatments from Correlated Normal Populations

The Bias Due to Incomplete Matching.

The Essential Role of Pair Matching in Cluster-Randomized Experiments, with Application to the Mexican Universal Health Insurance Evaluation