Bias-Tolerant Fair Classification

Yixuan Zhang, Feng Zhou, Zhidong Li, Yang Wang, Fang Chen
DOI: https://doi.org/10.48550/arXiv.2107.03207
2021-07-07
Abstract: Label bias and selection bias are acknowledged as two sources of bias in data that hinder the fairness of machine-learning outcomes. Label bias occurs when the labeling decision is disturbed by sensitive features, while selection bias occurs when subjective bias exists during data sampling. Even worse, models trained on such data can inherit or even intensify the discrimination. Most algorithmic fairness approaches perform empirical risk minimization with predefined fairness constraints, which tends to trade off accuracy for fairness. However, such methods achieve the desired fairness level at the cost of the benefits (receiving positive outcomes) for individuals affected by the bias. Therefore, we propose a Bias-Tolerant FAir Regularized Loss (B-FARL), which tries to regain the benefits using data affected by label bias and selection bias. B-FARL takes the biased data as input, yields a model that approximates the one trained with fair but latent data, and thus prevents discrimination without requiring constraints. In addition, we show the effective components by decomposing B-FARL, and we utilize the meta-learning framework for the B-FARL optimization. The experimental results on real-world datasets show that our method is empirically effective in improving fairness towards the direction of true but latent labels.
Machine Learning, Computers and Society
What problem does this paper attempt to address?
This paper addresses unfairness in machine-learning models that arises when the training data is affected by label bias and selection bias. Specifically:

1. **Label bias**: When the labeling decision is influenced by sensitive features (such as gender or race), labels get flipped. For example, in recruitment data, qualified candidates may be wrongly marked as unqualified.
2. **Selection bias**: Subjective bias in the data sampling process skews the proportion of data from certain groups. For example, fewer positive-label instances are selected from protected groups, which reduces the proportion of protected-group instances in the data.

Models trained on such data inherit or even exacerbate the discrimination it contains. Most existing algorithmic fairness methods perform empirical risk minimization with predefined fairness constraints added to the loss function. This often trades accuracy for fairness and sacrifices the interests of some individuals to meet the fairness requirements.

To overcome these challenges, the paper proposes a new loss function, the **Bias-Tolerant FAir Regularized Loss (B-FARL)**. Its main objectives are:

- To train models on data with label bias and selection bias while restoring the individual benefits lost due to bias.
- To learn fairness directly from the data, without explicit fairness constraints and without estimating the noise rate.
- To use the meta-learning framework to optimize hyper-parameters and improve efficiency.

With this method, the authors aim to improve model fairness without sacrificing accuracy and to better handle the bias problems found in real-world data.
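
The two bias mechanisms described above can be illustrated with a small simulation. The sketch below is not the paper's experimental protocol or the B-FARL loss. It only shows, under assumed settings, how label flipping and biased sampling each open a gap in positive-label rates between groups, starting from data where no gap exists. All function names, the flip and keep rates, and the choice of the positive-rate gap as a fairness measure are hypothetical and introduced here for illustration.

```python
import random

def make_fair_data(n, seed=0):
    """Generate (sensitive, label) pairs with labels independent of the group.
    Assumption: s == 1 marks the protected group, y == 1 the positive outcome."""
    rng = random.Random(seed)
    return [(rng.randint(0, 1), int(rng.random() < 0.5)) for _ in range(n)]

def apply_label_bias(data, flip_rate, seed=1):
    """Label bias: flip positive labels of the protected group to negative
    with probability flip_rate (an assumed, illustrative mechanism)."""
    rng = random.Random(seed)
    biased = []
    for s, y in data:
        if s == 1 and y == 1 and rng.random() < flip_rate:
            y = 0
        biased.append((s, y))
    return biased

def apply_selection_bias(data, keep_rate, seed=2):
    """Selection bias: keep positive protected instances only with
    probability keep_rate; every other instance is always kept."""
    rng = random.Random(seed)
    return [
        (s, y) for s, y in data
        if not (s == 1 and y == 1) or rng.random() < keep_rate
    ]

def positive_rate(data, group):
    """Fraction of positive labels within one group."""
    labels = [y for s, y in data if s == group]
    return sum(labels) / len(labels)

def parity_gap(data):
    """Positive rate of the unprotected group minus that of the protected group."""
    return positive_rate(data, 0) - positive_rate(data, 1)

fair = make_fair_data(20000)
label_biased = apply_label_bias(fair, flip_rate=0.4)
selected = apply_selection_bias(fair, keep_rate=0.5)

print("gap on fair data:          ", round(parity_gap(fair), 3))
print("gap after label bias:      ", round(parity_gap(label_biased), 3))
print("gap after selection bias:  ", round(parity_gap(selected), 3))
```

A model fitted by plain empirical risk minimization on `label_biased` or `selected` would be rewarded for reproducing these gaps, which is the inheritance effect the summary describes. Approaches that aim to recover the latent fair labels, as B-FARL is described as doing, target the distribution of `fair` rather than the observed one.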