Multiple imputation using auxiliary imputation variables that only predict missingness can increase bias due to data missing not at random

Elinor Curnow, Rosie P. Cornish, Jon E. Heron, James R. Carpenter, Kate Tilling
DOI: https://doi.org/10.1186/s12874-024-02353-9
2024-10-08
BMC Medical Research Methodology
Abstract: Epidemiological and clinical studies often have missing data, frequently analysed using multiple imputation (MI). In general, MI estimates will be biased if data are missing not at random (MNAR). Bias due to data MNAR can be reduced by including other variables ("auxiliary variables") in imputation models, in addition to those required for the substantive analysis. Common advice is to take an inclusive approach to auxiliary variable selection (i.e. include all variables thought to be predictive of missingness and/or the missing values). There are no clear guidelines about the impact of this strategy when data may be MNAR.
health care sciences & services