Abstract: Machine learning systems are notoriously prone to biased predictions about certain demographic groups, leading to algorithmic fairness issues. Due to privacy concerns and data quality problems, some demographic information may not be available in the training data, and the complex interaction of different demographics can lead to many unknown minority subpopulations, all of which limit the applicability of group fairness. Many existing works on fairness without demographics assume a correlation between groups and features. However, we argue that the model gradients are also valuable for fairness without demographics. In this paper, we show that the correlation between gradients and groups can help identify and improve group fairness. With an adversarial weighting architecture, we construct a graph where samples with similar gradients are connected and learn the weights of different samples from it. Unlike surrogate grouping methods that cluster groups from features and labels as proxy sensitive attributes, our method leverages the graph structure as a soft grouping mechanism, which is much more robust to noise. The results show that our method is robust to noise and can improve fairness significantly without decreasing the overall accuracy too much.

What problem does this paper attempt to address?

This paper addresses the tendency of machine learning systems to make biased predictions about certain population groups, which raises algorithmic fairness issues. Specifically, privacy concerns and data quality problems can make some demographic information unavailable in the training data, and complex interactions among demographic attributes produce many unknown minority subgroups. Both factors limit the applicability of group-based fairness methods. In addition, existing fairness methods that do not rely on demographic data typically assume an association between groups and features, and they handle noise and unknown subgroups poorly.

### Main contributions of the paper

1. **A new fairness algorithm that needs no sensitive attributes**: The algorithm identifies and improves group fairness by learning a Graph of Gradients (GoG) without directly using sensitive attributes (such as race or gender). It scales to large datasets and applies to various complex structures and domains.
2. **Proof that model gradients represent demographic groups more effectively than input features**: As long as the deep learning model performs better than random guessing, the gradients represent demographic information better than the input features do. The paper also proves that the last-layer gradients are sufficient for representing demographic information.
3. **A soft grouping mechanism based on the gradient graph**: The method constructs a gradient graph in which each sample is connected to its K nearest neighbors, and computes the sample weights from that graph. This soft grouping avoids the problems of hard-boundary grouping and is more robust to noise and outliers.

### Method overview

- **Problem definition**: Consider data \((x, y, s)\), where \(x\) represents non-sensitive features, \(y\) represents labels, and \(s\) represents sensitive attributes. Given \(x\), the model must predict \(y\) without relying on \(s\) while meeting certain fairness criteria.
- **Gradient definition**: Gradients not only provide information about model training but also carry information about data bias. The paper defines the undirected gradient \(g\in\mathbb{R}^{D\times M}\):
  \[ g_{d,j} = z_d\,|\hat{y}_j - y_j| \]
  where \(z_d\) is component \(d\) of the latent representation of the non-sensitive features, and \(\hat{y}_j - y_j\) is the prediction error for label category \(j\).
- **Theoretical analysis**: The paper proves, from an information-theoretic perspective, that the mutual information between the gradient (which combines the input features with the model error) and the sensitive attribute is greater than the mutual information between the input features and the sensitive attribute:
  \[ I(xU|s) > I(x|s) \]
- **Framework design**: The paper proposes an adversarial learning framework consisting of a main-task learner and an adversarial network that generates sample weights. The sample weights are constructed through the gradient graph: samples with similar gradients are connected to each other, and their weights are aggregated.

### Experimental verification

The paper reports extensive experiments on three public datasets. The results show that the method improves fairness while maintaining high overall accuracy, and that it is highly robust to noise.

### Summary

By introducing the gradient graph, the paper addresses fairness in machine learning without relying on demographic data, offering a new approach of both theoretical and practical significance.
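The gradient definition and the K-nearest-neighbor gradient graph described in the method overview can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the paper's implementation: the function names are hypothetical, Euclidean distance and mean aggregation over neighbors are assumed choices, and the `raw` scores stand in for the output of the adversarial network, whose architecture the summary does not specify.

```python
import numpy as np


def undirected_gradients(z, y_hat, y_onehot):
    """Per-sample g[d, j] = z[d] * |y_hat[j] - y[j]|, flattened to length D*M."""
    err = np.abs(y_hat - y_onehot)            # (N, M) absolute prediction error
    g = z[:, :, None] * err[:, None, :]       # (N, D, M) outer product per sample
    return g.reshape(len(z), -1)              # (N, D*M)


def knn_graph(g, k):
    """Indices of the k nearest rows of g for each row (assumed: Euclidean, self excluded)."""
    sq = np.sum(g ** 2, axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * (g @ g.T)   # squared pairwise distances
    np.fill_diagonal(dist, np.inf)                       # a sample is not its own neighbor
    return np.argsort(dist, axis=1)[:, :k]               # (N, k)


def aggregate_weights(raw, neighbours):
    """Average each raw weight with its neighbors' weights (assumed rule), then scale to mean 1."""
    pooled = (raw + raw[neighbours].sum(axis=1)) / (neighbours.shape[1] + 1)
    return pooled / pooled.mean()


# Toy data: N samples, D latent dimensions, M classes, K neighbors (all hypothetical).
rng = np.random.default_rng(0)
N, D, M, K = 8, 4, 3, 2
z = rng.normal(size=(N, D))                               # latent representations
logits = rng.normal(size=(N, M))
y_hat = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
y_onehot = np.eye(M)[rng.integers(0, M, size=N)]

g = undirected_gradients(z, y_hat, y_onehot)
nbrs = knn_graph(g, K)
raw = np.abs(y_hat - y_onehot).sum(axis=1)                # placeholder for adversary scores
weights = aggregate_weights(raw, nbrs)
```

Because each weight is pooled over a sample's neighborhood rather than assigned per hard cluster, a single noisy sample shifts the weights of only its few neighbors, which is the intuition behind the soft grouping the summary describes.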

Fairness without Demographics through Learning Graph of Gradients