Abstract: Although machine learning algorithms demonstrate impressive performance, their trustworthiness remains a critical issue, particularly with respect to fairness when they are deployed in real-world applications. Many notions of group fairness aim to minimize disparities in performance across protected groups. However, doing so can inadvertently reduce performance in certain groups, leading to sub-optimal outcomes. In contrast, the Min-max group fairness notion prioritizes improving the worst-performing group, thereby advocating a utility-promoting approach to fairness. However, it has been proven that existing efforts to achieve Min-max fairness exhibit limited effectiveness. In response to this challenge, we leverage the recently proposed "Neural Collapse" framework to re-examine Empirical Risk Minimization (ERM) training, specifically investigating the root causes of poor performance in minority groups. We employ the layer-peeled model to decompose a network into two parts, an encoder that learns latent representations and a subsequent classifier, and we systematically characterize their training behaviors. Our analysis reveals that while the classifiers achieve maximum separation, the separability of the representations is insufficient, particularly for minority groups. This indicates that the sub-optimal performance in minority groups stems from less separable representations rather than from the classifiers. To tackle this issue, we introduce a novel strategy that incorporates a frozen classifier to directly enhance the representations. Furthermore, we introduce two easily implemented loss functions to guide the learning process. Experiments on real-world benchmark datasets spanning Computer Vision, Natural Language Processing, and Tabular data demonstrate that our approach outperforms existing state-of-the-art methods in promoting the Min-max fairness notion.
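The two ideas in the abstract, a classifier that is fixed at maximum separation and a fairness measure driven by the worst-performing group, can be illustrated with a minimal sketch. The abstract does not say how the frozen classifier is constructed; the simplex equiangular tight frame (ETF) used here is an assumption borrowed from the broader Neural Collapse literature, and all function names (`simplex_etf`, `predict`, `worst_group_accuracy`) and the toy data are hypothetical. The paper's two loss functions are not reproduced.

```python
import math

def simplex_etf(num_classes):
    """Return K unit-norm class vectors (rows) forming a simplex ETF in R^K.

    Assumption: a simplex ETF stands in for the "frozen classifier"; every
    pair of class vectors has cosine -1/(K-1), the maximal equal separation.
    """
    k = num_classes
    scale = math.sqrt(k / (k - 1))
    return [[scale * ((1.0 if i == j else 0.0) - 1.0 / k) for j in range(k)]
            for i in range(k)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def predict(weights, feature):
    """Classify by the largest logit under the fixed (never updated) weights."""
    logits = [dot(w, feature) for w in weights]
    return max(range(len(logits)), key=logits.__getitem__)

def worst_group_accuracy(preds, labels, groups):
    """Return (minimum per-group accuracy, per-group accuracy dict)."""
    correct, total = {}, {}
    for p, y, g in zip(preds, labels, groups):
        total[g] = total.get(g, 0) + 1
        correct[g] = correct.get(g, 0) + int(p == y)
    per_group = {g: correct[g] / total[g] for g in total}
    return min(per_group.values()), per_group

# Toy encoder outputs (hypothetical): minority features are less separable.
W = simplex_etf(3)
features = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.2, 0.1, 0.3], [0.0, 0.9, 0.1]]
labels = [0, 1, 0, 1]
groups = ["majority", "majority", "minority", "minority"]

preds = [predict(W, f) for f in features]
worst, per_group = worst_group_accuracy(preds, labels, groups)
print(per_group, worst)
```

In this toy setup the classifier weights never change, so any improvement in the worst-group accuracy would have to come from the encoder producing more separable features, which mirrors the diagnosis stated in the abstract.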

Less is more: Selecting informative and diverse subsets with balancing constraints

Learning From Less Data: A Unified Data Subset Selection and Active Learning Framework for Computer Vision

Beyond Size and Class Balance: Alpha as a New Dataset Quality Metric for Deep Learning

On Distributed Larger-Than-Memory Subset Selection With Pairwise Submodular Functions

Embrace Sustainable AI: Dynamic Data Subset Selection for Image Classification

Diversify and Conquer: Diversity-Centric Data Selection with Iterative Refinement

Learning by Grouping: A Multilevel Optimization Framework for Improving Fairness in Classification without Losing Accuracy

Data Summarization via Bilevel Optimization

Selection via Proxy: Efficient Data Selection for Deep Learning

Finding High-Value Training Data Subset through Differentiable Convex Programming

Statistical Undersampling with Mutual Information and Support Points

Optimal Data Selection: An Online Distributed View

On Diversity in Discriminative Neural Networks

Faster Algorithms for Fair Max-Min Diversification in $\mathbb{R}^d$

On Benefits of Selection Diversity Via Bilevel Exclusive Sparsity

Neural Collapse Inspired Debiased Representation Learning for Min-max Fairness

Constrained Optimization to Train Neural Networks on Critical and Under-Represented Classes

The More, the Better? Active Silencing of Non-Positive Transfer for Efficient Multi-Domain Few-Shot Classification

Most Influential Subset Selection: Challenges, Promises, and Beyond

Illuminating the Diversity-Fitness Trade-Off in Black-Box Optimization

DELT: A Simple Diversity-driven EarlyLate Training for Dataset Distillation