Abstract: As they have a vital effect on social decision-making, AI algorithms should be not only accurate but also fair. Among various algorithms for fair AI, learning fair representation (LFR), whose goal is to find a fair representation with respect to sensitive variables such as gender and race, has received much attention. For LFR, the adversarial training scheme is popularly employed, as is done in generative adversarial network type algorithms. The choice of a discriminator, however, is done heuristically without justification. In this paper, we propose a new adversarial training scheme for LFR, where the integral probability metric (IPM) with a specific parametric family of discriminators is used. The most notable result of the proposed LFR algorithm is its theoretical guarantee about the fairness of the final prediction model, which has not been considered yet. That is, we derive theoretical relations between the fairness of the representation and the fairness of the prediction model built on top of the representation (i.e., using the representation as the input). Moreover, by numerical experiments, we show that our proposed LFR algorithm is computationally lighter and more stable, and that the final prediction model is competitive with or superior to other LFR algorithms using more complex discriminators.

What problem does this paper attempt to address?

This paper asks how to guarantee the fairness of the final prediction model when learning fair representations (LFR). Its answer is to measure the fairness of the representation with a parametric integral probability metric (IPM) and to derive theoretical guarantees from that choice. Specifically, the paper focuses on two issues:

1. **The relationship between fair representation and the final prediction model**:
   - Existing LFR algorithms have had practical success, but they do not establish theoretically how the fairness of the representation affects the fairness of the final prediction model. This matters because the ultimate goal of LFR is to construct a fair prediction model.
   - By introducing a parametric IPM, with the sigmoid function family as the discriminator class, the paper establishes a theoretical relationship between the fairness of the representation and the fairness of the final prediction model.
2. **The choice of discriminator**:
   - In existing LFR algorithms, the discriminator is usually chosen heuristically, without theoretical justification. This can lead to high computational cost and unstable training.
   - The paper proposes a specific parametric discriminator family and provides theoretical guarantees showing that this choice controls the fairness of the final prediction model while being more computationally efficient and stable.

### Main contributions

1. **Propose a new fair representation learning method**:
   - A simple but powerful LFR method (sIPM-LFR) is proposed, built on a new adversarial training scheme that uses the parametric IPM.
2. **Provide theoretical guarantees**:
   - The paper proves a theoretical relationship between the fairness of the representation and the fairness of the final prediction model, which had not been considered in existing research.
3. **Experimental verification**:
   - The sIPM-LFR algorithm is evaluated on multiple benchmark datasets. The results show that it is competitive with, or even superior to, other existing LFR algorithms in prediction performance.

### Formula presentation

- **Integral Probability Metric (IPM)**:
  \[ d_V(P_0, P_1)=\sup_{v\in V}\left|\int v(z)(dP_0(z)-dP_1(z))\right| \]
  where \(V\) is the discriminator function class from \(Z\) to \(\mathbb{R}\), and \(P_0\) and \(P_1\) are two probability measures.
- **Sigmoid IPM**:
  \[ V_{\text{sig}}=\left\{\sigma(\theta^{\top}z + \mu):\theta\in\mathbb{R}^m,\mu\in\mathbb{R}\right\} \]
  where \(\sigma(t)=(1 + \exp(-t))^{-1}\) is the sigmoid function.
- **Fairness metric (DP-fairness, i.e., demographic parity)**:
  \[ \text{DP}_\phi(g)=\left|\mathbb{E}[\phi\circ g(X, S)\mid S = 0]-\mathbb{E}[\phi\circ g(X, S)\mid S = 1]\right| \]

Together, these definitions and the accompanying analysis relate the fairness of the learned representation to the fairness of the final prediction model, and the resulting algorithm is reported to be computationally lighter and more stable than alternatives with more complex discriminators.
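To make the formulas above concrete, here is a minimal NumPy sketch of their empirical (sample-based) versions. It is an illustration under stated assumptions, not the paper's implementation: the function names are hypothetical, the supremum over the sigmoid family is approximated by a crude random search rather than by the adversarial training the abstract describes, and \(\phi\) defaults to the identity.

```python
import numpy as np


def sigmoid(t):
    """Sigmoid function: 1 / (1 + exp(-t))."""
    return 1.0 / (1.0 + np.exp(-t))


def sigmoid_witness_gap(z0, z1, theta, mu):
    """Empirical gap for ONE sigmoid discriminator v(z) = sigmoid(theta^T z + mu).

    z0, z1: arrays of shape (n0, m) and (n1, m) holding representations
    of the two sensitive groups (S = 0 and S = 1).
    """
    return abs(sigmoid(z0 @ theta + mu).mean() - sigmoid(z1 @ theta + mu).mean())


def sipm_random_search(z0, z1, n_candidates=2000, scale=3.0, seed=0):
    """Lower bound on the empirical sigmoid IPM via random search.

    Assumption: random search over (theta, mu) is a stand-in for the
    gradient-based maximization used in adversarial training.
    """
    rng = np.random.default_rng(seed)
    m = z0.shape[1]
    best = 0.0
    for _ in range(n_candidates):
        theta = rng.normal(scale=scale, size=m)
        mu = rng.normal(scale=scale)
        best = max(best, sigmoid_witness_gap(z0, z1, theta, mu))
    return best


def dp_gap(scores, s, phi=lambda y: y):
    """Empirical DP_phi: |E[phi(g) | S = 0] - E[phi(g) | S = 1]|.

    scores: model outputs g(X, S); s: binary sensitive attribute.
    phi defaults to the identity (an assumption made for illustration).
    """
    scores = np.asarray(scores, dtype=float)
    s = np.asarray(s)
    return abs(phi(scores[s == 0]).mean() - phi(scores[s == 1]).mean())


# Synthetic example: group 1 is either distributed like group 0 or shifted.
rng = np.random.default_rng(1)
z_group0 = rng.normal(size=(500, 2))
z_group1_same = rng.normal(size=(500, 2))
z_group1_shifted = rng.normal(loc=2.0, size=(500, 2))

sipm_same = sipm_random_search(z_group0, z_group1_same)
sipm_shifted = sipm_random_search(z_group0, z_group1_shifted)
```

Because every sigmoid discriminator takes values in (0, 1), each gap lies in [0, 1]. In this synthetic setup the shifted pair should yield a visibly larger value than the identically distributed pair, which is the sense in which a small sigmoid IPM indicates a representation that carries little information about the sensitive attribute.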

Learning fair representation with a parametric integral probability metric

Learning fair representations via an adversarial framework

Metrizing Fairness

FAIRM: Learning invariant representations for algorithmic fairness and domain generalization with minimax optimality

Adaptive Fair Representation Learning for Personalized Fairness in Recommendations via Information Alignment

Fairness via Adversarial Attribute Neighbourhood Robust Learning

Group Fairness by Probabilistic Modeling with Latent Fair Decisions

Inference for an Algorithmic Fairness-Accuracy Frontier

Few-Shot Fairness: Unveiling LLM's Potential for Fairness-Aware Classification

Privacy for Fairness: Information Obfuscation for Fair Representation Learning with Local Differential Privacy

Estimating and Improving Fairness with Adversarial Learning

Fair Inference for Discrete Latent Variable Models

Flexible Fairness-Aware Learning via Inverse Conditional Permutation

Learning Fair and Interpretable Representations via Linear Orthogonalization

FaiR-N: Fair and Robust Neural Networks for Structured Data

Towards Fairness-Aware Adversarial Learning

Fairness with Adaptive Weights

To be Robust or to be Fair: Towards Fairness in Adversarial Training

Fair Representation Learning through Implicit Path Alignment

RULER: Discriminative and Iterative Adversarial Training for Deep Neural Network Fairness

Fairness in Machine Learning with Tractable Models