Counterfactual Prediction Under Outcome Measurement Error

Luke Guerdan, Amanda Coston, Kenneth Holstein, Zhiwei Steven Wu
DOI: https://doi.org/10.1145/3593013.3594101
2023-05-18
Abstract: Across domains such as medicine, employment, and criminal justice, predictive models often target labels that imperfectly reflect the outcomes of interest to experts and policymakers. For example, clinical risk assessments deployed to inform physician decision-making often predict measures of healthcare utilization (e.g., costs, hospitalization) as a proxy for patient medical need. These proxies can be subject to outcome measurement error when they systematically differ from the target outcome they are intended to measure. However, prior modeling efforts to characterize and mitigate outcome measurement error overlook the fact that the decision being informed by a model often serves as a risk-mitigating intervention that impacts the target outcome of interest and its recorded proxy. Thus, in these settings, addressing measurement error requires counterfactual modeling of treatment effects on outcomes. In this work, we study intersectional threats to model reliability introduced by outcome measurement error, treatment effects, and selection bias from historical decision-making policies. We develop an unbiased risk minimization method which, given knowledge of proxy measurement error properties, corrects for the combined effects of these challenges. We also develop a method for estimating treatment-dependent measurement error parameters when these are unknown in advance. We demonstrate the utility of our approach theoretically and via experiments on real-world data from randomized controlled trials conducted in healthcare and employment domains. As importantly, we demonstrate that models correcting for outcome measurement error or treatment effects alone suffer from considerable reliability limitations. Our work underscores the importance of considering intersectional threats to model validity during the design and evaluation of predictive models for decision support.
Machine Learning, Computers and Society, Human-Computer Interaction, Methodology
What problem does this paper attempt to address?
### Problems the paper attempts to solve

The paper addresses threats to model reliability that arise when outcome measurement error (OME), treatment effects, and selection bias occur together in predictive models. In domains such as healthcare, employment, and criminal justice, the labels that models target often imperfectly reflect the outcomes that experts and policymakers actually care about. For example, clinical risk assessments typically predict healthcare utilization (such as cost or hospitalization) as a proxy for patients' medical need, and these proxies can differ systematically from the target outcome. Previous work on outcome measurement error overlooks the fact that the decision a model informs is often itself a risk-mitigating intervention, one that affects both the target outcome and its recorded proxy. Addressing measurement error in these settings therefore requires counterfactual modeling of treatment effects. The paper proposes an unbiased risk minimization method that corrects for the combined effects of these challenges when the proxy's measurement error properties are known. It also develops a method for estimating treatment-dependent measurement error parameters when they are not known in advance.

### Main contributions

1. **Problem formulation**: The paper proposes a problem formulation that models the interactions between OME, treatment effects, and selection bias.
2. **Counterfactual modeling method**: A new method is developed for learning counterfactual models in the presence of OME, along with a flexible method for estimating measurement error rates that are not known in advance.
3. **Experimental verification**: Experiments on synthetic data and semi-synthetic data (built on randomized controlled trials from the healthcare and employment domains) verify the effectiveness of the proposed method and highlight the reliability limitations of models that correct for OME alone or for treatment effects alone.

### Background and related work

- **AI functionality and validity problems**: Previous studies have examined specific modeling problems in detail and synthesized them into broader critiques of the validity and functionality of AI systems.
- **Outcome measurement error**: Modeling outcome measurement error is challenging because it introduces two sources of uncertainty: which error model is reasonable for a given proxy, and what the specific error parameters relating the target outcome to the proxy outcome are under the assumed measurement model.
- **Counterfactual prediction**: Recent studies have shown that counterfactual modeling is required when the decision informed by a predictive model is a risk-mitigating intervention. Building on this, the paper argues that modeling OME also requires accounting for the effect of treatment on both the target outcome and its observed proxy.

### Methods

- **Unbiased risk minimization**: The paper proposes an unbiased risk minimization method. When the measurement error parameters are known, the method estimates the probability of the target potential outcome by minimizing a re-weighted surrogate loss function.
- **Error parameter estimation**: A conditional class probability estimation (CCPE) method is developed to estimate the error parameters from observational data.

### Experiments

- **Synthetic data experiment**: A controlled evaluation is carried out on synthetic data, where the ground-truth potential outcomes are fully observed.
- **Semi-synthetic data experiment**: A semi-synthetic evaluation is carried out using real data from randomized controlled trials in the healthcare and employment domains, which better reflects the conditions of real-world deployment.

### Conclusions

Through theoretical analysis and experimental verification, the paper shows that when OME, treatment effects, and selection bias are all present, correcting for only one of them is not enough. The research emphasizes the importance of carefully considering measurement assumptions when designing and evaluating predictive models.
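
The re-weighted surrogate loss described under Methods can be illustrated with a small sketch. This is not the authors' implementation. It assumes a binary outcome, a log loss, a standard class-conditional noise correction, and inverse propensity weighting for selection bias. The names `corrected_loss` and `counterfactual_risk`, and the parameters `alpha` (the probability that the proxy is 1 when the target is 0) and `beta` (the probability that the proxy is 0 when the target is 1), are hypothetical and chosen only for illustration.

```python
import numpy as np


def log_loss(p, y):
    """Binary cross-entropy of predicted probability p against a label y in {0, 1}."""
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))


def corrected_loss(p, y_proxy, alpha, beta):
    """Surrogate loss on proxy labels whose expectation equals the loss on target labels.

    Assumed error model (illustrative): alpha = P(proxy=1 | target=0),
    beta = P(proxy=0 | target=1), with alpha + beta < 1.
    """
    denom = 1.0 - alpha - beta
    if denom <= 0:
        raise ValueError("alpha + beta must be below 1 for the correction to exist")
    # Loss assigned when the observed proxy label is 1 or 0, respectively.
    loss_if_1 = ((1 - alpha) * log_loss(p, 1) - beta * log_loss(p, 0)) / denom
    loss_if_0 = ((1 - beta) * log_loss(p, 0) - alpha * log_loss(p, 1)) / denom
    return np.where(y_proxy == 1, loss_if_1, loss_if_0)


def counterfactual_risk(p, y_proxy, treated, propensity, t, alpha_t, beta_t):
    """Inverse-propensity-weighted estimate of the risk under treatment t.

    propensity is P(treated=1 | x) for each unit; only units that actually
    received treatment t contribute, re-weighted to undo selection bias.
    """
    prob_t = propensity if t == 1 else 1.0 - propensity
    weights = (treated == t) / prob_t
    return np.mean(weights * corrected_loss(p, y_proxy, alpha_t, beta_t))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 1000
    propensity = np.full(n, 0.5)           # randomized assignment, as in an RCT
    treated = rng.binomial(1, propensity)
    y_proxy = rng.binomial(1, 0.3, size=n)  # placeholder proxy labels
    p = np.full(n, 0.3)                     # placeholder model predictions
    print(counterfactual_risk(p, y_proxy, treated, propensity, 1, 0.1, 0.2))
```

In this sketch the error rates for each treatment arm are passed in as known values. When they are unknown, they would have to come from a separate estimation step, such as the CCPE approach mentioned under Methods, which is not shown here.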