Differentially Private Fair Learning

Matthew Jagielski, Michael Kearns, Jieming Mao, Alina Oprea, Aaron Roth, Saeed Sharifi-Malvajerdi, Jonathan Ullman
DOI: https://doi.org/10.48550/arXiv.1812.02696
2019-06-01
Abstract: Motivated by settings in which predictive models may be required to be non-discriminatory with respect to certain attributes (such as race), but even collecting the sensitive attribute may be forbidden or restricted, we initiate the study of fair learning under the constraint of differential privacy. We design two learning algorithms that simultaneously promise differential privacy and equalized odds, a 'fairness' condition that corresponds to equalizing false positive and negative rates across protected groups. Our first algorithm is a private implementation of the equalized odds post-processing approach of [Hardt et al., 2016]. This algorithm is appealingly simple, but must be able to use protected group membership explicitly at test time, which can be viewed as a form of 'disparate treatment'. Our second algorithm is a differentially private version of the oracle-efficient in-processing approach of [Agarwal et al., 2018] that can be used to find the optimal fair classifier, given access to a subroutine that can solve the original (not necessarily fair) learning problem. This algorithm is more complex but need not have access to protected group membership at test time. We identify new tradeoffs between fairness, accuracy, and privacy that emerge only when requiring all three properties, and show that these tradeoffs can be milder if group membership may be used at test time. We conclude with a brief experimental evaluation.
Machine Learning, Data Structures and Algorithms, Computer Science and Game Theory
What problem does this paper attempt to address?
The paper asks how to train non-discriminatory predictive models while protecting sensitive attributes such as race. It is motivated by settings in which even collecting the sensitive attribute may be forbidden or restricted, and it studies learning algorithms that satisfy differential privacy (DP) and a fairness condition, equalized odds, at the same time. Equalized odds requires false positive and false negative rates to be equal across protected groups.

The paper proposes two algorithms that achieve differential privacy and equalized odds simultaneously:

1. **Differentially private post-processing** (DP-postprocessing): a private implementation of the equalized odds post-processing approach of [Hardt et al., 2016]. It adjusts a model's predictions to meet the equalized odds condition, using protected group membership explicitly at test time. The method is appealingly simple, but its explicit use of group membership can be viewed as a form of 'disparate treatment' and may be infeasible or illegal in some applications.
2. **Differentially private in-processing** (DP-oracle-learner): a differentially private version of the oracle-efficient in-processing approach of [Agarwal et al., 2018]. Given access to a subroutine that solves the original (not necessarily fair) learning problem, it can be used to find the optimal fair classifier. The method is more complex, but it does not need access to protected group membership at test time.

The paper also identifies new tradeoffs between fairness, accuracy, and privacy that emerge only when all three properties are required, and shows that these tradeoffs can be milder if group membership may be used at test time. It concludes with a brief experimental evaluation.
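
To make the two requirements concrete, the following minimal sketch estimates per-group false positive and false negative rates from Laplace-noised counts and reports the largest gap between groups, which equalized odds would drive to zero. It is an illustration under stated assumptions, not either of the paper's algorithms: the function names, the assumption that the list of groups is public, the replace-one-record notion of neighboring datasets (L1 sensitivity 2 for the count table), and the use of plain floating-point Laplace sampling are all choices made here for exposition.

```python
import random


def laplace(scale, rng):
    # The difference of two i.i.d. exponentials with rate 1/scale is Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_group_rates(y_true, y_pred, group, groups, epsilon, rng=None):
    """Return {group: (fpr, fnr)} computed from Laplace-noised counts.

    Assumption: neighboring datasets differ by replacing one record, so the
    table of (group, label, prediction) counts has L1 sensitivity 2 and
    Laplace noise of scale 2/epsilon on every cell gives epsilon-DP.
    `groups` is assumed to be public, not derived from the data.
    """
    rng = rng or random.Random()
    scale = 2.0 / epsilon
    rates = {}
    for a in groups:
        counts = {}
        for y in (0, 1):
            for y_hat in (0, 1):
                exact = sum(
                    1
                    for t, p, g in zip(y_true, y_pred, group)
                    if g == a and t == y and p == y_hat
                )
                # Clipping at zero is post-processing and keeps the guarantee.
                counts[(y, y_hat)] = max(exact + laplace(scale, rng), 0.0)
        fpr = counts[(0, 1)] / max(counts[(0, 0)] + counts[(0, 1)], 1e-12)
        fnr = counts[(1, 0)] / max(counts[(1, 0)] + counts[(1, 1)], 1e-12)
        rates[a] = (fpr, fnr)
    return rates


def equalized_odds_gaps(rates):
    """Largest difference across groups in FPR and in FNR (both 0 under equalized odds)."""
    fprs = [r[0] for r in rates.values()]
    fnrs = [r[1] for r in rates.values()]
    return max(fprs) - min(fprs), max(fnrs) - min(fnrs)
```

With a small `epsilon` the noise can dominate the counts of a small group, so the measured gaps become unreliable. This is one informal way to see why requiring privacy and fairness together can cost more than requiring either alone.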