Abstract: Fairness measurement is crucial for assessing algorithmic bias in various types of machine learning (ML) models, including ones used for search relevance, recommendation, personalization, talent analytics, and natural language processing. However, the fairness measurement paradigm is currently dominated by fairness metrics that examine disparities in allocation and/or prediction error as univariate key performance indicators (KPIs) for a protected attribute or group. Although important and effective in assessing ML bias in certain contexts such as recidivism, existing metrics don’t work well in many real-world applications of ML, where imperfect models are applied to an array of instances encompassing a multivariate mixture of protected attributes and form part of a broader process pipeline. Consequently, the upstream representational harm quantified by existing metrics based on how the model represents protected groups doesn’t necessarily relate to allocational harm in the application of such models in downstream policy/decision contexts. We propose FAIR-Frame, a model-based framework for parsimoniously modeling fairness across multiple protected attributes in regard to the representational and allocational harm associated with the upstream design/development and downstream usage of ML models. We evaluate the efficacy of our proposed framework on two testbeds pertaining to text classification using pretrained language models. The upstream testbeds encompass over fifty thousand documents associated with twenty-eight thousand users, seven protected attributes and five different classification tasks. The downstream testbeds span three policy outcomes and over 5.41 million total observations. Results in comparison with several existing metrics show that the upstream representational harm measures produced by FAIR-Frame and other metrics are significantly different from one another, and that FAIR-Frame’s representational fairness measures have the highest percentage alignment and lowest error with allocational harm observed in downstream applications. Our findings have important implications for various ML contexts, including information retrieval, user modeling, digital platforms, and text classification, where responsible and trustworthy AI is becoming an imperative.
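
To make the contrast concrete, the kind of univariate metric the abstract describes (a disparity in allocation or in prediction error for a single protected attribute) can be sketched as below. This is only an illustrative sketch of such conventional metrics, not FAIR-Frame itself; the function names `group_rates` and `disparity` and the toy data are hypothetical and do not come from the paper.

```python
from collections import defaultdict


def group_rates(y_true, y_pred, groups):
    """Per-group positive-prediction rate and error rate for ONE protected attribute."""
    counts = defaultdict(lambda: {"n": 0, "pos": 0, "err": 0})
    for t, p, g in zip(y_true, y_pred, groups):
        c = counts[g]
        c["n"] += 1
        c["pos"] += int(p == 1)   # allocation: how often the group receives a positive prediction
        c["err"] += int(p != t)   # prediction error: how often the model is wrong for the group
    return {
        g: {"positive_rate": c["pos"] / c["n"], "error_rate": c["err"] / c["n"]}
        for g, c in counts.items()
    }


def disparity(rates, key):
    """Univariate KPI: largest gap between any two groups on the chosen rate."""
    values = [r[key] for r in rates.values()]
    return max(values) - min(values)


# Toy data (hypothetical): binary labels, binary predictions, one protected attribute.
y_true = [1, 0, 1, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 1, 0, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

rates = group_rates(y_true, y_pred, groups)
allocation_gap = disparity(rates, "positive_rate")  # disparity in allocation
error_gap = disparity(rates, "error_rate")          # disparity in prediction error
print(allocation_gap, error_gap)
```

Each gap summarizes one protected attribute at a time, which is the univariate framing the abstract argues falls short when instances carry a multivariate mixture of protected attributes.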

An Empirical Comparison of Bias Reduction Methods on Real-World Problems in High-Stakes Policy Settings

A Comprehensive Empirical Study of Bias Mitigation Methods for Machine Learning Classifiers

Metrics and methods for a systematic comparison of fairness-aware machine learning algorithms

Data vs. Model Machine Learning Fairness Testing: An Empirical Study

Analyzing Fairness of Computer Vision and Natural Language Processing Models

Different Horses for Different Courses: Comparing Bias Mitigation Algorithms in ML

A Comparative Study of Fairness in Medical Machine Learning

Fairer and more accurate, but for whom?

Simultaneous Improvement of ML Model Fairness and Performance by Identifying Bias in Data

Do the Machine Learning Models on a Crowd Sourced Platform Exhibit Bias? An Empirical Study on Model Fairness

Fix Fairness, Don't Ruin Accuracy: Performance Aware Fairness Repair using AutoML

Fairness And Performance In Harmony: Data Debiasing Is All You Need

Analyzing Fairness of Classification Machine Learning Model with Structured Dataset

Towards Fair Machine Learning Software: Understanding and Addressing Model Bias Through Counterfactual Thinking

A novel approach for assessing fairness in deployed machine learning algorithms

Aleatoric and Epistemic Discrimination: Fundamental Limits of Fairness Interventions

A Simulation Based Dynamic Evaluation Framework for System-wide Algorithmic Fairness

A Survey on Bias and Fairness in Machine Learning

Should Fairness be a Metric or a Model? A Model-based Framework for Assessing Bias in Machine Learning Pipelines

Fairness Deconstructed: A Sociotechnical View of 'Fair' Algorithms in Criminal Justice

Fairness for machine learning software in education: A systematic mapping study