Counterfactually Fair Regression with Double Machine Learning

Patrick Rehill
2023-03-21
Abstract: Counterfactual fairness is an approach to AI fairness that tries to make decisions based on the outcomes that an individual with some kind of sensitive status would have had without this status. This paper proposes Double Machine Learning (DML) Fairness which analogises this problem of counterfactual fairness in regression problems to that of estimating counterfactual outcomes in causal inference under the Potential Outcomes framework. It uses arbitrary machine learning methods to partial out the effect of sensitive variables on nonsensitive variables and outcomes. Assuming that the effects of the two sets of variables are additively separable, outcomes will be approximately equalised and individual-level outcomes will be counterfactually fair. This paper demonstrates the approach in a simulation study pertaining to discrimination in workplace hiring and an application on real data estimating the GPAs of law school students. It then discusses when it is appropriate to apply such a method to problems of real-world discrimination where constructs are conceptually complex and finally, whether DML Fairness can achieve justice in these settings.
Machine Learning, Methodology
What problem does this paper attempt to address?
This paper addresses how to achieve counterfactual fairness in regression tasks, so that machine learning models do not discriminate on sensitive attributes (such as gender or race) when making predictions and decisions. Specifically, the author proposes a method called "Double Machine Learning Fairness" (DML Fairness), which uses arbitrary machine learning methods to partial out the effect of sensitive variables on both the non-sensitive variables and the outcome. If the effects of the sensitive and non-sensitive variables on the outcome are additively separable, outcomes will be approximately equalized across groups and individual-level outcomes will be counterfactually fair. The author demonstrates the method in a simulation study of discrimination in workplace hiring and in an application to real data on law school students' GPAs, and then discusses when it is appropriate to apply it to real-world discrimination. DML Fairness may be a powerful tool, but it is not a panacea: whether it is appropriate, and whether it can achieve justice, depends on the specific application context.
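
The partialling-out idea described above can be illustrated with a minimal sketch. This is not the paper's implementation: the simulated data, the coefficient values, the helper names `cross_fit_residuals` and `group_gap`, and the use of a plain linear fit as the nuisance learner are all assumptions made for illustration (the paper allows arbitrary machine learning methods here). The sketch residualises both the non-sensitive feature and the outcome on the sensitive attribute using out-of-fold fits, then regresses one residual on the other.

```python
import numpy as np


def cross_fit_residuals(target, sensitive, n_folds=2, seed=0):
    """Out-of-fold residuals of `target` after a linear fit on `sensitive`.

    The linear learner is an illustrative stand-in; any regressor could
    be swapped in for the nuisance fit.
    """
    rng = np.random.default_rng(seed)
    n = len(target)
    folds = rng.permutation(n) % n_folds  # random, near-equal fold labels
    design = np.column_stack([np.ones(n), sensitive])
    residuals = np.empty(n)
    for k in range(n_folds):
        train, test = folds != k, folds == k
        coef, *_ = np.linalg.lstsq(design[train], target[train], rcond=None)
        residuals[test] = target[test] - design[test] @ coef
    return residuals


def group_gap(pred, sensitive):
    """Difference in mean prediction between the two sensitive groups."""
    return float(pred[sensitive == 1].mean() - pred[sensitive == 0].mean())


# Hypothetical simulated data with additively separable effects.
rng = np.random.default_rng(42)
n = 4000
a = rng.integers(0, 2, size=n).astype(float)            # sensitive attribute
x = 1.0 * a + rng.normal(size=n)                        # non-sensitive feature, shifted by a
y = 2.0 * x - 1.5 * a + rng.normal(scale=0.5, size=n)   # outcome

# Naive regression of y on x: predictions inherit the group difference in x.
naive_design = np.column_stack([np.ones(n), x])
naive_coef, *_ = np.linalg.lstsq(naive_design, y, rcond=None)
naive_gap = group_gap(naive_design @ naive_coef, a)

# Partial the sensitive attribute out of both x and y, then regress the
# residualised outcome on the residualised feature.
x_res = cross_fit_residuals(x, a)
y_res = cross_fit_residuals(y, a)
slope = float((x_res @ y_res) / (x_res @ x_res))
fair_gap = group_gap(slope * x_res, a)

print(f"naive group gap: {naive_gap:.3f}")
print(f"residualised group gap: {fair_gap:.3f}")
print(f"estimated effect of x: {slope:.3f}")
```

In this toy setting the residualised predictions should show a group gap close to zero, while the naive predictions retain a clear gap. That result relies on the additive separability assumption being built into the simulated data; it says nothing about cases where the sensitive and non-sensitive variables interact.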