Prediction de‐correlated inference: A safe approach for post‐prediction inference

Feng Gan, Wanfeng Liang, Changliang Zou
DOI: https://doi.org/10.1111/anzs.12429
2024-10-26
Australian & New Zealand Journal of Statistics
Abstract: In modern data analysis, it is common to use machine learning methods to predict outcomes on unlabelled datasets and then use these pseudo‐outcomes in subsequent statistical inference. Inference in this setting is often called post‐prediction inference. We propose a novel assumption‐lean framework for statistical inference in the post‐prediction setting, called prediction de‐correlated inference (PDC). Our approach is safe, in the sense that PDC can automatically adapt to any black‐box machine‐learning model and consistently outperform its supervised counterparts. The PDC framework is also easy to extend to accommodate multiple predictive models. Both numerical results and real‐world data analysis demonstrate the superiority of PDC over state‐of‐the‐art methods.
statistics & probability