Principal Stratification, Partial Contingency Table, and Statistical Leverage
Yu Xie, S. Murphy
2007-01-01
Abstract: Frangakis, Rubin, An, and MacKenzie (hereafter FRAM) propose an analysis to do what may seem impossible: to recover input data that are missing due to death and then use the (observed and missing) input data to predict death. FRAM show that, under certain assumptions, this can be done by introducing an additional variable, "treatment," that possesses certain desirable properties. We organize our comments as follows. First, we present the logic behind FRAM's analysis from the perspective of contingency table analysis. Second, with insights from this perspective, we consider the implications of FRAM's analysis. Third, we discuss some considerations that should be taken into account in practice. It appears that FRAM's analysis hinges on the notion of principal stratification (Angrist, Imbens, and Rubin 1996; Frangakis and Rubin 2002), i.e., the idea that discrete subpopulations, or strata, have distinct patterns of response to a treatment (called Z in the paper). For simplicity, we focus on the main case discussed by FRAM, in which there are only two strata: a stratum of "always survivors," who survive regardless of the treatment, and a stratum of "protectable" patients, whose lives can be saved, but who cannot be harmed, by the treatment. Here the principal stratification assumption can be replaced by a less restrictive assumption: Assumption 2'. If treatment is Z=1, then the person must be alive at 3 months (S=1) or, equivalently, P[S=1|Z=1]=1. Assumption 2' is true if FRAM's Assumption 2 is true, but Assumption 2' invokes neither potential outcomes nor principal stratification. FRAM's crucial ignorability assumption, Assumption 1, is that the assignment of Z is independent of both stratum membership and input data (A), conditional on covariate X (see below for more on this assumption). We note that covariate X plays no special role in FRAM's paper except to make the ignorability assumption plausible. Thus, the discussion that follows is conditional on X. In terms of time ordering, the