Abstract: Recent discoveries have revealed that deep neural networks might behave in a biased manner in many real-world scenarios. For instance, deep networks trained on the large-scale face recognition dataset CelebA tend to predict blonde hair for females and black hair for males. Such biases not only jeopardize the robustness of models but also perpetuate and amplify social biases, which is especially concerning for automated decision-making processes in healthcare, recruitment, etc., as they could exacerbate unfair economic and social inequalities among different groups. Existing debiasing methods suffer from high costs in bias labeling or model re-training, and they also fall short in explaining where the biases within the model originate. To this end, we propose a fast model debiasing framework (FMD), which offers an efficient approach to identify, evaluate, and remove biases inherent in trained models. FMD identifies biased attributes through an explicit counterfactual concept and quantifies the influence of data samples with influence functions. Moreover, we design a machine unlearning-based strategy to efficiently and effectively remove the bias in a trained model with a small counterfactual dataset. Experiments on the Colored MNIST, CelebA, and Adult Income datasets, along with experiments with large language models, demonstrate that our method achieves superior or competitive accuracy compared with state-of-the-art methods while exhibiting significantly less bias and requiring much lower debiasing cost. Notably, our method requires only a small external dataset and updates a minimal number of model parameters, without requiring access to training data that may be too large or unavailable in practice.
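To make the idea of influence-based unlearning more concrete, the following is a minimal sketch on a toy linear model. It is not the FMD implementation: the data, the squared-gap bias measure, the step size `eps`, and all function names are assumptions introduced for illustration only. The sketch measures bias as the prediction gap between a sample and its counterfactual (the same sample with the bias attribute flipped), then applies the standard first-order influence estimate, `-eps * H^{-1} * grad`, to approximate how the parameters would move if the bias term were upweighted, without re-training.

```python
# Toy linear model f(x) = w_s * s + w_b * b, where b is a hypothetical bias attribute.
# Labels follow "s OR b", so the fitted model leans on b as well as on s.
train = [((1.0, 1.0), 1.0), ((1.0, 1.0), 1.0), ((0.0, 0.0), 0.0),
         ((0.0, 0.0), 0.0), ((1.0, 0.0), 1.0), ((0.0, 1.0), 1.0)]

# Small external set used only for measuring and removing bias (assumed data).
external = [(1.0, 0.0), (0.0, 1.0)]


def solve2(a, r):
    """Solve the 2x2 linear system a @ x = r with Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return ((r[0] * a[1][1] - a[0][1] * r[1]) / det,
            (a[0][0] * r[1] - a[1][0] * r[0]) / det)


def hessian(data):
    """Hessian of the mean squared-error loss 1/(2n) * sum (w.x - y)^2."""
    n = len(data)
    h = [[0.0, 0.0], [0.0, 0.0]]
    for x, _ in data:
        for i in range(2):
            for j in range(2):
                h[i][j] += x[i] * x[j] / n
    return h


def fit(data):
    """Least-squares fit via the normal equations H w = X^T y / n."""
    n = len(data)
    rhs = [sum(x[i] * y for x, y in data) / n for i in range(2)]
    return solve2(hessian(data), rhs)


def counterfactual(x):
    """Flip the binary bias attribute and keep everything else fixed."""
    return (x[0], 1.0 - x[1])


def bias(w, samples):
    """Half the mean squared prediction gap between samples and counterfactuals."""
    total = 0.0
    for x in samples:
        cf = counterfactual(x)
        gap = w[0] * (x[0] - cf[0]) + w[1] * (x[1] - cf[1])
        total += 0.5 * gap * gap
    return total / len(samples)


def bias_grad(w, samples):
    """Gradient of bias() with respect to the parameters w."""
    g = [0.0, 0.0]
    for x in samples:
        cf = counterfactual(x)
        d = (x[0] - cf[0], x[1] - cf[1])
        gap = w[0] * d[0] + w[1] * d[1]
        for i in range(2):
            g[i] += gap * d[i] / len(samples)
    return g


def unlearn(w, h, samples, eps=0.1):
    """First-order influence update: w - eps * H^{-1} * grad(bias)."""
    step = solve2(h, bias_grad(w, samples))
    return (w[0] - eps * step[0], w[1] - eps * step[1])


w = fit(train)
bias_before = bias(w, external)
w_new = unlearn(w, hessian(train), external)
bias_after = bias(w_new, external)
print("weights before:", w, "bias:", bias_before)
print("weights after: ", w_new, "bias:", bias_after)
```

In this sketch the update shifts weight from the bias attribute to the other feature in a single closed-form step. The training Hessian is used here only because the toy data is tiny; a setting without access to training data would need a different curvature estimate.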

Reducing Bias in Pre-trained Models by Tuning while Penalizing Change

Bias Mitigation in Fine-tuning Pre-trained Models for Enhanced Fairness and Efficiency

Fast Model Debias with Machine Unlearning

Intra-Processing Methods for Debiasing Neural Networks

Are Bias Mitigation Techniques for Deep Learning Effective?

FairFix: Enhancing Fairness of Pre-Trained Deep Neural Networks with Scarce Data Resources

Rethinking Common Assumptions to Mitigate Racial Bias in Face Recognition Datasets

Removing biased data to improve fairness and accuracy

Biasing & Debiasing based Approach Towards Fair Knowledge Transfer for Equitable Skin Analysis

Deep-Treat: Learning Optimal Personalized Treatments From Observational Data Using Neural Networks

Detecting and Reducing Bias in a High Stakes Domain

Simultaneous Improvement of ML Model Fairness and Performance by Identifying Bias in Data

An Empirical Analysis of Parameter-Efficient Methods for Debiasing Pre-Trained Language Models

Automated Bias Reduction in Deep Learning Based Melanoma Diagnosis using a Semi-Supervised Algorithm

Utilizing Adversarial Examples for Bias Mitigation and Accuracy Enhancement

Debiasing surgeon: fantastic weights and how to find them

Addressing Bias Through Ensemble Learning and Regularized Fine-Tuning

Efficient Bias Mitigation Without Privileged Information

Mitigating bias in artificial intelligence: Fair data generation via causal models for transparent and explainable decision-making

Exploring Biases and Prejudice of Facial Synthesis via Semantic Latent Space

Fighting Bias with Bias: Promoting Model Robustness by Amplifying Dataset Biases