Abstract: Machine learning (ML) is playing an increasingly important role in rendering decisions that affect a broad range of groups in society. ML models inform decisions in criminal justice, the extension of credit in banking, and the hiring practices of corporations. This creates a requirement for model fairness, which holds that automated decisions should be equitable with respect to protected features (e.g., gender, race, or age), some groups of which are often under-represented in the data. We postulate that this problem of under-representation has a counterpart in the problem of imbalanced data learning. Imbalance is often present in both classes and protected features. For example, one class (those receiving credit) may be over-represented with respect to another class (those not receiving credit), and a particular group (females) may be under-represented with respect to another group (males). A key element in achieving algorithmic fairness with respect to protected groups is the simultaneous reduction of class and protected group imbalance in the underlying training data, which facilitates increases in both model accuracy and fairness. We discuss the importance of bridging imbalanced learning and group fairness by showing how key concepts in these fields overlap and complement each other, and we propose a novel oversampling algorithm, Fair Oversampling, that addresses both skewed class distributions and protected features. Our method: (i) can be used as an efficient pre-processing algorithm for standard ML algorithms to jointly address imbalance and group equity; and (ii) can be combined with fairness-aware learning algorithms to improve their robustness to varying levels of class imbalance. Additionally, we take a step toward bridging the gap between fairness and imbalanced learning with a new metric, Fair Utility, that combines balanced accuracy with fairness.
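
To make the idea of jointly reducing class and protected-group imbalance concrete, the following is a minimal sketch under stated assumptions. It is not the Fair Oversampling algorithm proposed in the paper, whose details the abstract does not give. It swaps in plain random oversampling with replacement, applied to each (class, protected group) cell until all cells match the largest one. The function name `oversample_subgroups`, the keys `credit` and `sex`, and the toy counts are hypothetical and chosen only to mirror the credit example above.

```python
import random
from collections import Counter, defaultdict


def oversample_subgroups(rows, label_key, group_key, seed=0):
    """Randomly duplicate rows so that every (label, group) cell
    reaches the size of the largest cell.

    Illustrative stand-in only: plain random oversampling, not the
    paper's Fair Oversampling method.
    """
    if not rows:
        return []
    rng = random.Random(seed)

    # Partition the data by (class label, protected group).
    cells = defaultdict(list)
    for row in rows:
        cells[(row[label_key], row[group_key])].append(row)

    target = max(len(members) for members in cells.values())

    balanced = []
    for members in cells.values():
        balanced.extend(members)
        # Sample with replacement to fill the gap up to the target size.
        balanced.extend(rng.choice(members) for _ in range(target - len(members)))
    return balanced


# Hypothetical toy data: credit granted (1) is over-represented,
# and females ("F") are under-represented in both classes.
rows = (
    [{"credit": 1, "sex": "M"}] * 6
    + [{"credit": 1, "sex": "F"}] * 2
    + [{"credit": 0, "sex": "M"}] * 3
    + [{"credit": 0, "sex": "F"}] * 1
)

balanced = oversample_subgroups(rows, "credit", "sex")
counts = Counter((r["credit"], r["sex"]) for r in balanced)
```

After this step, each of the four (class, group) cells in `counts` has the same size, so both the class ratio and the group ratio within each class are equalized. Duplicating rows adds no new information, which is one reason a dedicated oversampling method may be preferable in practice.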

Towards Algorithmic Fairness by means of Instance-level Data Re-weighting based on Shapley Values

Fairness with Adaptive Weights.

Adaptive Priority Reweighing for Generalizing Fairness Improvement.

Explaining contributions of features towards unfairness in classifiers: A novel threshold-dependent Shapley value-based approach

Towards A Holistic View of Bias in Machine Learning: Bridging Algorithmic Fairness and Imbalanced Learning

FaiR-N: Fair and Robust Neural Networks for Structured Data

Explainability for fair machine learning

Algorithmic Fairness: Choices, Assumptions, and Definitions

FairIF: Boosting Fairness in Deep Learning via Influence Functions with Validation Set Sensitive Attributes

Equitable Data Valuation Meets the Right to Be Forgotten in Model Markets

Improving Fairness for Data Valuation in Horizontal Federated Learning

FairPrep: Promoting Data to a First-Class Citizen in Studies on Fairness-Enhancing Interventions

Helpful or Harmful Data? Fine-tuning-free Shapley Attribution for Explaining Language Model Predictions

CHG Shapley: Efficient Data Valuation and Selection towards Trustworthy Machine Learning

AIM: Attributing, Interpreting, Mitigating Data Unfairness

From Efficiency to Equity: Measuring Fairness in Preference Learning

CS-Shapley: Class-wise Shapley Values for Data Valuation in Classification

Towards Data Valuation via Asymmetric Data Shapley

Towards Fair and Calibrated Models

Alpha and Prejudice: Improving $α$-sized Worst-case Fairness via Intrinsic Reweighting

Data vs. Model Machine Learning Fairness Testing: An Empirical Study