Abstract: Most of the ML datasets we use today are biased. When we train models on these biased datasets, they often not only learn dataset biases but can also amplify them -- a phenomenon known as bias amplification. Several co-occurrence-based metrics have been proposed to measure bias amplification between a protected attribute A (e.g., gender) and a task T (e.g., cooking). However, these metrics fail to measure biases when A is balanced with T. To measure bias amplification in balanced datasets, recent work proposed a predictability-based metric called leakage amplification. However, leakage amplification cannot identify the direction in which biases are amplified. In this work, we propose a new predictability-based metric called directional predictability amplification (DPA). DPA measures directional bias amplification, even for balanced datasets. Unlike leakage amplification, DPA is easier to interpret and less sensitive to attacker models (a hyperparameter in predictability-based metrics). Our experiments on tabular and image datasets show that DPA is an effective metric for measuring directional bias amplification. The code will be available soon.
What problem does this paper attempt to address?
The paper addresses how to measure and interpret bias amplification in balanced datasets. Co-occurrence-based metrics fail when the protected attribute is balanced with the task, and the existing predictability-based alternative, leakage amplification, cannot identify the direction of amplification and is sensitive to the choice of attacker model. The authors therefore propose a new predictability-based metric, Directional Predictability Amplification (DPA), to overcome these limitations.
### Specific description of the problem
1. **Limitations of existing methods**:
- **Lack of directionality**: The existing leakage amplification methods cannot distinguish the direction of bias amplification, for example, whether it is from the protected attribute \(A\) to the task \(T\), or from the task \(T\) to the protected attribute \(A\).
- **Unbounded values**: Leakage amplification does not have a fixed value range, which makes its results difficult to interpret.
- **Relative changes not considered**: Leakage amplification reports only the absolute change and ignores how large that change is relative to the bias already present in the dataset.
- **Sensitivity to attacker models**: Different choices of attacker models will lead to different leakage amplification values.
2. **Challenges in balanced datasets**:
- In a balanced dataset, the protected attribute \(A\) and the task \(T\) co-occur equally often, but biases can still arise from unlabeled parts of the data.
- Co-occurrence-based metrics may then wrongly report that there is no bias amplification.
### Solution
The authors propose Directional Predictability Amplification (DPA), which has the following advantages:
- **Directionality**: DPA can measure the direction of bias amplification, that is, \(A\rightarrow T\) or \(T\rightarrow A\).
- **Bounded values**: DPA values always lie in the fixed range \((-1, 1)\), which makes them easier to interpret.
- **Relative changes**: DPA takes into account the relative change of the original bias in the dataset.
- **Insensitivity to attacker models**: DPA is relatively robust to the choice of attacker models.
### Mathematical formula representation
The definition of DPA is as follows:
For the bias amplification in the direction of \(A\rightarrow T\):
\[ \text{DPA}_{A\rightarrow T}=\frac{\Psi_{M, A\rightarrow T}-\Psi_{D, A\rightarrow T}}{\Psi_{M, A\rightarrow T}+\Psi_{D, A\rightarrow T}} \]
For the bias amplification in the direction of \(T\rightarrow A\):
\[ \text{DPA}_{T\rightarrow A}=\frac{\Psi_{M, T\rightarrow A}-\Psi_{D, T\rightarrow A}}{\Psi_{M, T\rightarrow A}+\Psi_{D, T\rightarrow A}} \]
where:
- \( \Psi_{D, A\rightarrow T}=Q(f^T_D(A), T) \)
- \( \Psi_{M, A\rightarrow T}=Q(f^T_M(\hat{A}), T) \)
- \( \Psi_{D, T\rightarrow A}=Q(f^A_D(T), A) \)
- \( \Psi_{M, T\rightarrow A}=Q(f^A_M(\hat{T}), A) \)
Here, \(Q\) scores an attacker's output against its target. \( f^A_D\) and \( f^A_M\) are attacker models that predict the protected attribute \(A\) from the dataset's true task labels \(T\) and from the model's predictions \(\hat{T}\), respectively; \( f^T_D\) and \( f^T_M\) are attacker models that predict the task \(T\) from the dataset's true attribute labels \(A\) and from the model's predictions \(\hat{A}\), respectively.
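The bounded range and the relative-change behaviour can be checked directly from the normalized form of the definition. The short derivation below assumes that both predictability scores are strictly positive (for example, accuracies of attackers that get at least some samples right); that positivity is an assumption made here, and the numbers are illustrative, not results from the paper.

\[ \Psi_M, \Psi_D > 0 \;\Rightarrow\; |\Psi_M-\Psi_D| < \Psi_M+\Psi_D \;\Rightarrow\; -1 < \text{DPA} < 1 \]
\[ \Psi_D = 0.6,\ \Psi_M = 0.8: \quad \text{DPA}=\frac{0.8-0.6}{0.8+0.6}=\frac{0.2}{1.4}\approx 0.143 \]
\[ \Psi_D = 0.2,\ \Psi_M = 0.4: \quad \text{DPA}=\frac{0.4-0.2}{0.4+0.2}=\frac{0.2}{0.6}\approx 0.333 \]

Both cases have the same absolute change of 0.2, but the normalized value is larger when the dataset's original predictability is smaller. The sign follows the numerator: the value is positive when the model's predictions are more predictable than the dataset labels, negative when they are less predictable, and zero when the two scores are equal.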
In this way, DPA measures not only how much bias is amplified but also the direction in which it is amplified.
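The \(T\rightarrow A\) computation can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the attacker is a hypothetical majority-vote lookup table, \(Q\) is assumed to be plain accuracy, the attacker is fitted and scored on the same toy data for brevity, and all function names and data are invented for the example. It follows the convention that the dataset attacker sees the true task labels while the model attacker sees the model's predictions.

```python
from collections import Counter, defaultdict


def fit_lookup_attacker(inputs, targets):
    """Hypothetical attacker: predict the most frequent target seen for each input value."""
    counts = defaultdict(Counter)
    for x, y in zip(inputs, targets):
        counts[x][y] += 1
    fallback = Counter(targets).most_common(1)[0][0]
    table = {x: c.most_common(1)[0][0] for x, c in counts.items()}
    return lambda x: table.get(x, fallback)


def accuracy(predictions, targets):
    """Assumed quality function Q: fraction of exact matches."""
    return sum(p == t for p, t in zip(predictions, targets)) / len(targets)


def predictability(inputs, targets):
    """Psi = Q(f(inputs), targets); f is fitted on the same data (a simplification)."""
    attacker = fit_lookup_attacker(inputs, targets)
    return accuracy([attacker(x) for x in inputs], targets)


def dpa(psi_model, psi_dataset):
    """Normalized difference (Psi_M - Psi_D) / (Psi_M + Psi_D)."""
    total = psi_model + psi_dataset
    if total == 0:
        raise ValueError("both predictability scores are zero; DPA is undefined")
    return (psi_model - psi_dataset) / total


# Toy data: A is balanced with T, but the model's predictions T_hat lean on A.
A = [0, 0, 0, 0, 1, 1, 1, 1]
T = [1, 1, 0, 0, 1, 1, 0, 0]
T_hat = [1, 1, 1, 0, 1, 0, 0, 0]

psi_d = predictability(T, A)      # attacker predicts A from the true task labels
psi_m = predictability(T_hat, A)  # attacker predicts A from the model's predictions
dpa_t_to_a = dpa(psi_m, psi_d)

print(f"Psi_D={psi_d:.2f}  Psi_M={psi_m:.2f}  DPA(T->A)={dpa_t_to_a:.2f}")
```

In this toy setup the true labels reveal nothing about \(A\) (accuracy 0.5), while the model's predictions do (accuracy 0.75), so the sketch reports a positive value even though the dataset is balanced. The \(A\rightarrow T\) direction would swap the roles, with the attacker predicting \(T\) from \(A\) and from \(\hat{A}\); a realistic evaluation would use a learned attacker and held-out data.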