Abstract: Although machine learning algorithms demonstrate impressive performance, their trustworthiness remains a critical issue, particularly with respect to fairness when they are deployed in real-world applications. Many notions of group fairness aim to minimize disparities in performance across protected groups. However, such notions can inadvertently reduce performance in certain groups, leading to sub-optimal outcomes. In contrast, the Min-max group fairness notion prioritizes improvement for the worst-performing group, thereby advocating a utility-promoting approach to fairness. However, it has been proven that existing efforts to achieve Min-max fairness exhibit limited effectiveness. In response to this challenge, we leverage the recently proposed "Neural Collapse" framework to re-examine Empirical Risk Minimization (ERM) training, specifically investigating the root causes of poor performance in minority groups. We employ the layer-peeled model to decompose a network into two parts, an encoder that learns latent representations and a subsequent classifier, and systematically characterize their training behaviors. Our analysis reveals that while the classifiers achieve maximum separation, the separability of the representations is insufficient, particularly for minority groups. This indicates that the sub-optimal performance in minority groups stems from less separable representations rather than from the classifiers. To tackle this issue, we introduce a novel strategy that incorporates a frozen classifier to directly enhance the representations. Furthermore, we introduce two easily implemented loss functions to guide the learning process. Experiments on real-world benchmark datasets spanning Computer Vision, Natural Language Processing, and Tabular data demonstrate that our approach outperforms existing state-of-the-art methods in promoting the Min-max fairness notion.
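To make two ideas from the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes the frozen, maximally separated classifier can be modeled as a simplex equiangular tight frame (ETF), a common choice in the Neural Collapse literature that the abstract itself does not specify, and it measures Min-max fairness as worst-group accuracy. The function names `simplex_etf`, `frozen_logits`, and `worst_group_accuracy` are hypothetical.

```python
import numpy as np


def simplex_etf(num_classes: int, dim: int, seed: int = 0) -> np.ndarray:
    """Return a (num_classes, dim) matrix whose rows form a simplex ETF.

    Assumption: the frozen classifier is a simplex ETF. Rows have unit norm
    and every pair of rows has cosine similarity -1 / (num_classes - 1).
    """
    if num_classes < 2 or dim < num_classes:
        raise ValueError("need num_classes >= 2 and dim >= num_classes")
    rng = np.random.default_rng(seed)
    # Orthonormal basis U with shape (dim, num_classes) from a reduced QR.
    u, _ = np.linalg.qr(rng.standard_normal((dim, num_classes)))
    k = num_classes
    centering = np.eye(k) - np.ones((k, k)) / k
    return (np.sqrt(k / (k - 1)) * u @ centering).T


def frozen_logits(features: np.ndarray, classifier: np.ndarray) -> np.ndarray:
    """Score encoder features against the fixed classifier (never updated)."""
    return features @ classifier.T


def worst_group_accuracy(y_true, y_pred, groups):
    """Return (worst accuracy, per-group accuracies) over protected groups."""
    y_true, y_pred, groups = map(np.asarray, (y_true, y_pred, groups))
    per_group = {
        g.item(): float(np.mean(y_pred[groups == g] == y_true[groups == g]))
        for g in np.unique(groups)
    }
    return min(per_group.values()), per_group
```

In a training loop under this assumption, only the encoder's parameters would receive gradients while the matrix returned by `simplex_etf` stays fixed, so any gain in worst-group accuracy has to come from more separable representations.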

Learning fair representations via an adversarial framework

Fairness via Adversarial Attribute Neighbourhood Robust Learning

Group Fairness by Probabilistic Modeling with Latent Fair Decisions

Learning Fair Classifiers via Min-Max F-divergence Regularization

Neural Collapse Inspired Debiased Representation Learning for Min-max Fairness

Towards Fairness-Aware Adversarial Learning

Fairness with Adaptive Weights

Learning fair representation with a parametric integral probability metric

Fair Representation Learning through Implicit Path Alignment

Unfairness Despite Awareness: Group-Fair Classification with Strategic Agents

Automatic Fairness Testing of Neural Classifiers through Adversarial Sampling

Self-Supervised Fair Representation Learning without Demographics

Fair Classification with Noisy Protected Attributes: A Framework with Provable Guarantees

Learning Fair and Interpretable Representations via Linear Orthogonalization

Estimating and Improving Fairness with Adversarial Learning

Identifying, measuring, and mitigating individual unfairness for supervised learning models and application to credit risk models

Fairness-aware Adversarial Perturbation Towards Bias Mitigation for Deployed Deep Models

Fairness in Machine Learning with Tractable Models

Fair Representation Learning: an Alternative to Mutual Information

Fair Supervised Learning with A Simple Random Sampler of Sensitive Attributes

Fair Inference for Discrete Latent Variable Models