Nonstandard Conditionally Specified Models for Nonignorable Missing Data.

Alexander M. Franks, Edoardo M. Airoldi, Donald B. Rubin
DOI: https://doi.org/10.1073/pnas.1815563117
IF: 11.1
2020-01-01
Proceedings of the National Academy of Sciences
Abstract: Data analyses typically rely upon assumptions about missingness mechanisms that lead to observed versus missing data. When the data are missing not at random, direct assumptions about the missingness mechanism, and indirect assumptions about the distributions of observed and missing data, are typically untestable. We explore an approach, where the joint distribution of observed data and missing data is specified through non-standard conditional distributions. In this formulation, which traces back to a factorization of the joint distribution, apparently proposed by J.W. Tukey, the modeling assumptions about the conditional factors are either testable or are designed to allow the incorporation of substantive knowledge about the problem at hand, thereby offering a possibly realistic portrayal of the data, both missing and observed. We apply Tukey's conditional representation to exponential family models, and we propose a computationally tractable inferential strategy for this class of models. We illustrate the utility of this approach using high-throughput biological data with missing data that are not missing at random.
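The factorization mentioned in the abstract can be sketched as follows. The notation is an assumption made here for illustration and is not taken from the abstract: Y is the variable subject to missingness and R is the response indicator, with R = 1 when Y is observed and R = 0 when it is missing. Applying Bayes' rule to each conditional density gives the density of the missing values in terms of the density of the observed values and the odds of nonresponse:

```latex
% Assumed notation: Y is the outcome, R = 1 if Y is observed, R = 0 if missing.
\begin{equation*}
  f(y \mid R = 0)
  \;=\;
  f(y \mid R = 1)\,
  \frac{\Pr(R = 0 \mid y)}{\Pr(R = 1 \mid y)}\,
  \frac{\Pr(R = 1)}{\Pr(R = 0)} .
\end{equation*}

% Illustrative special case (an assumption, chosen only to show the mechanics):
% observed values are normal and the response probability is logistic in y.
\begin{align*}
  f(y \mid R = 1) &= \mathcal{N}(y \mid \mu, \sigma^{2}), &
  \Pr(R = 1 \mid y) &= \frac{1}{1 + e^{-(\beta_{0} + \beta_{1} y)}}, \\
  \frac{\Pr(R = 0 \mid y)}{\Pr(R = 1 \mid y)} &= e^{-(\beta_{0} + \beta_{1} y)}
  & \Longrightarrow \quad
  f(y \mid R = 0) &= \mathcal{N}\!\left(y \mid \mu - \beta_{1}\sigma^{2},\, \sigma^{2}\right).
\end{align*}
```

In this reading, the first factor involves only observed data and so can be checked against them, while the odds term is where substantive knowledge about the missingness mechanism would enter. In the normal-logistic illustration, the missing values keep the observed variance and their mean is shifted by the amount \beta_{1}\sigma^{2}.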