Abstract: Most of the ML datasets we use today are biased. When we train models on these biased datasets, they often not only learn dataset biases but can also amplify them -- a phenomenon known as bias amplification. Several co-occurrence-based metrics have been proposed to measure bias amplification between a protected attribute A (e.g., gender) and a task T (e.g., cooking). However, these metrics fail to measure biases when A is balanced with T. To measure bias amplification in balanced datasets, recent work proposed a predictability-based metric called leakage amplification. However, leakage amplification cannot identify the direction in which biases are amplified. In this work, we propose a new predictability-based metric called directional predictability amplification (DPA). DPA measures directional bias amplification, even for balanced datasets. Unlike leakage amplification, DPA is easier to interpret and less sensitive to attacker models (a hyperparameter in predictability-based metrics). Our experiments on tabular and image datasets show that DPA is an effective metric for measuring directional bias amplification. The code will be available soon.

What problem does this paper attempt to address?

The paper addresses how to measure and interpret bias amplification in balanced datasets. Existing bias amplification metrics have limitations in this setting: they cannot identify the direction of bias amplification and are sensitive to the choice of attacker model. The authors therefore propose a new predictability-based metric, Directional Predictability Amplification (DPA), to overcome these limitations.

### Specific description of the problem

1. **Limitations of existing methods**:
   - **Lack of directionality**: Leakage amplification cannot distinguish the direction of bias amplification, that is, whether it runs from the protected attribute \(A\) to the task \(T\) or from the task \(T\) to the protected attribute \(A\).
   - **Unbounded values**: Leakage amplification has no fixed value range, which makes its results difficult to interpret.
   - **Relative changes not considered**: Leakage amplification computes only the absolute change and ignores the change relative to the bias already present in the dataset.
   - **Sensitivity to attacker models**: Different choices of attacker model lead to different leakage amplification values.
2. **Challenges in balanced datasets**:
   - In balanced datasets, the co-occurrence frequencies of the protected attribute \(A\) and the task \(T\) are equal, but biases can still arise from unlabeled parts of the data.
   - Existing co-occurrence-based methods may wrongly report no bias amplification in this case.

### Solution

The authors propose Directional Predictability Amplification (DPA), which has the following advantages:

- **Directionality**: DPA measures the direction of bias amplification, that is, \(A\rightarrow T\) or \(T\rightarrow A\).
- **Bounded values**: DPA takes values in the fixed range \((-1, 1)\), which makes it convenient to interpret.
- **Relative changes**: DPA takes into account the change relative to the original bias in the dataset.
- **Insensitivity to attacker models**: DPA is relatively robust to the choice of attacker model.

### Mathematical formula representation

DPA is defined as follows. For bias amplification in the direction \(A\rightarrow T\):

\[ \text{DPA}_{A\rightarrow T}=\frac{\Psi_{M, A\rightarrow T}-\Psi_{D, A\rightarrow T}}{\Psi_{M, A\rightarrow T}+\Psi_{D, A\rightarrow T}} \]

For bias amplification in the direction \(T\rightarrow A\):

\[ \text{DPA}_{T\rightarrow A}=\frac{\Psi_{M, T\rightarrow A}-\Psi_{D, T\rightarrow A}}{\Psi_{M, T\rightarrow A}+\Psi_{D, T\rightarrow A}} \]

where:

- \( \Psi_{D, A\rightarrow T}=Q(f^T_D(A), T) \)
- \( \Psi_{M, A\rightarrow T}=Q(f^T_M(A), \hat{T}) \)
- \( \Psi_{D, T\rightarrow A}=Q(f^A_D(T), A) \)
- \( \Psi_{M, T\rightarrow A}=Q(f^A_M(T), \hat{A}) \)

Here, as the definitions above indicate, \( f^T_D\) and \( f^T_M\) are attacker models that take the protected attribute \(A\) as input; \(Q\) compares their outputs with the true task labels \(T\) of the dataset and with the model predictions \(\hat{T}\) respectively. Likewise, \( f^A_D\) and \( f^A_M\) are attacker models that take the task \(T\) as input; \(Q\) compares their outputs with the true protected attribute \(A\) of the dataset and with the model predictions \(\hat{A}\) respectively. In this way, DPA can not only measure the bias.
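The DPA formula above can be illustrated with a small Python sketch for the \(A\rightarrow T\) direction. Everything beyond the formula itself is an assumption made for illustration: the attacker is a hypothetical majority-vote lookup table, the quality function \(Q\) is taken to be plain accuracy, the attacker is scored on the same toy data it was fitted on, and the names `fit_lookup_attacker`, `quality`, and `dpa` are not from the paper. The authors' actual attacker models and choice of \(Q\) may differ.

```python
from collections import Counter, defaultdict


def fit_lookup_attacker(inputs, targets):
    """Hypothetical attacker: map each input value to its most frequent target."""
    counts = defaultdict(Counter)
    for x, y in zip(inputs, targets):
        counts[x][y] += 1
    table = {x: c.most_common(1)[0][0] for x, c in counts.items()}
    return lambda xs: [table[x] for x in xs]


def quality(predictions, targets):
    """Stand-in for Q (an assumption): fraction of attacker outputs that match."""
    return sum(p == t for p, t in zip(predictions, targets)) / len(targets)


def dpa(psi_model, psi_data):
    """DPA = (Psi_M - Psi_D) / (Psi_M + Psi_D)."""
    total = psi_model + psi_data
    if total <= 0:
        raise ValueError("Psi_M + Psi_D must be positive")
    return (psi_model - psi_data) / total


# Toy data (made up): protected attribute A, true task labels T,
# and a model's task predictions T_hat.
A = [0, 0, 0, 0, 1, 1, 1, 1]
T = [0, 0, 0, 1, 1, 1, 1, 0]
T_hat = [0, 0, 0, 0, 1, 1, 1, 1]

# Psi_D: how well an attacker recovers the true labels T from A.
attacker_d = fit_lookup_attacker(A, T)
psi_d = quality(attacker_d(A), T)

# Psi_M: how well an attacker recovers the model predictions T_hat from A.
attacker_m = fit_lookup_attacker(A, T_hat)
psi_m = quality(attacker_m(A), T_hat)

dpa_a_to_t = dpa(psi_m, psi_d)
print(f"Psi_D={psi_d:.2f}  Psi_M={psi_m:.2f}  DPA(A->T)={dpa_a_to_t:.4f}")
```

In this toy case the model's predictions are more predictable from \(A\) than the true labels are, so the sketch yields a positive value, which corresponds to amplification in the \(A\rightarrow T\) direction. Because the difference is divided by the sum, the result stays within \((-1, 1)\) whenever both \(\Psi\) values are positive.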

Making Bias Amplification in Balanced Datasets Directional and Interpretable
