Abstract: In widely used neural network-based collaborative filtering models, users' history logs are encoded into latent embeddings that represent the users' preferences. In this setting, the models are capable of mapping users' protected attributes (e.g., gender or ethnicity) from these user embeddings even without explicit access to them, resulting in models that may treat specific demographic user groups unfairly and raise privacy issues. While prior work has approached the removal of a single protected attribute of a user at a time, multiple attributes might come into play in real-world scenarios. In the work at hand, we present AdvXMultVAE which aims to unlearn multiple protected attributes (exemplified by gender and age) simultaneously to improve fairness across demographic user groups. For this purpose, we couple a variational autoencoder (VAE) architecture with adversarial training (AdvXMultVAE) to support simultaneous removal of the users' protected attributes with continuous and/or categorical values. Our experiments on two datasets, LFM-2b-100k and Ml-1m, from the music and movie domains, respectively, show that our approach can yield better results than its singular removal counterparts (based on AdvMultVAE) in effectively mitigating demographic biases whilst improving the anonymity of latent embeddings.
What problem does this paper attempt to address?
The paper addresses the following problem: in neural-network-based collaborative filtering recommender systems, users' history logs are encoded into latent embedding vectors from which protected attributes (such as gender or age) can be inferred, even though the model never has direct access to these attributes. As a result, the model may treat specific demographic groups unfairly and raise privacy issues.
Specifically, the paper proposes a method named **AdvXMultVAE**, which aims to remove multiple protected attributes (exemplified by gender and age) simultaneously through adversarial training, in order to improve fairness across demographic groups and enhance the anonymity of the latent embeddings. Unlike prior work, which removes a single attribute at a time, this method handles multiple continuous and/or categorical protected attributes at once.
### Key Problem Summary:
1. **Implicitly Encoded Sensitive Information**: The latent embeddings of the recommendation system may implicitly contain users' sensitive attributes (such as gender, age, etc.), leading to unfairness and privacy risks.
2. **Multi-Attribute Removal**: In real-world scenarios, multiple protected attributes may exist simultaneously, so a method that can remove multiple attributes simultaneously is required.
3. **Maintaining Recommendation Performance**: While removing sensitive attributes, ensure that the performance of the recommendation system does not significantly decline.
### Solutions:
- **Model Architecture**: Use the Variational Autoencoder (VAE) combined with adversarial training to construct the **AdvXMultVAE** model.
- **Adversarial Modules**: Introduce an adversarial module for each protected attribute. These modules attempt to predict the protected attributes from the latent embeddings, while the main model attempts to minimize the accuracy of these predictions.
- **Multi-Attribute Handling**: Simultaneously handle multiple protected attributes through multiple adversarial modules to achieve more comprehensive de-biasing.
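
The opposing goals of the adversarial modules and the main model can be illustrated with a toy example. The sketch below is not the paper's implementation. It assumes a one-dimensional linear encoder `z = w * x`, a single linear adversary `p_hat = a * z` trained with squared error, and a gradient-reversal-style update in which the encoder steps against the adversary's loss gradient. All names (`adversary_loss`, `grads`, `training_step`, `lam`) are hypothetical, and the recommendation loss that the encoder would also minimize is omitted.

```python
def adversary_loss(w, a, x, p):
    """Squared error of the adversary predicting attribute p from z = w * x."""
    z = w * x
    return (a * z - p) ** 2


def grads(w, a, x, p):
    """Analytic gradients of the adversary loss w.r.t. encoder weight w and adversary weight a."""
    z = w * x
    err = a * z - p
    d_w = 2.0 * err * a * x  # dL/dw, flows back through the embedding z
    d_a = 2.0 * err * z      # dL/da
    return d_w, d_a


def training_step(w, a, x, p, lr=0.05, lam=1.0):
    """One update: the adversary descends on its loss, the encoder uses the reversed gradient."""
    d_w, d_a = grads(w, a, x, p)
    new_a = a - lr * d_a        # adversary becomes better at predicting p
    new_w = w + lr * lam * d_w  # encoder moves so that p is harder to predict
    return new_w, new_a
```

Here `lam` is an assumed scaling factor for the reversed gradient. With several protected attributes, one such adversary would be attached per attribute and their reversed gradients would all reach the shared encoder.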
### Experimental Verification:
The paper conducted experiments on two public datasets, LFM-2b-100k (music) and Ml-1m (movies). The results show that **AdvXMultVAE** can outperform its single-attribute removal counterparts (based on AdvMultVAE) in mitigating demographic bias and improving the anonymity of the latent embeddings, while maintaining competitive recommendation performance.
### Formula Representation:
- The loss function of the Variational Autoencoder:
\[
L_{MULT} = L_{REC}(g(z), x) - \beta L_{KL}(\mathcal{N}(\mu, \sigma), \mathcal{N}(0, I))
\]
where \(L_{REC}\) is the reconstruction loss, \(L_{KL}\) is the Kullback-Leibler divergence, and \(\beta\) is a hyperparameter that adjusts the regularization strength.
- The optimization objective of adversarial training:
\[
\operatorname*{arg\,min}_{f, g, \{h_0, \ldots, h_k\}} L_{MULT}(x) + L_{advX}(x, P)
\]
where \(L_{advX}\) is the sum of the losses of all adversarial modules, each of which predicts one protected attribute from the latent embedding \(z\):
\[
L_{advX} = \sum_{i=0}^{k} L_{adv_i}(h_i(z), p_i)
\]
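
As a numeric illustration of the terms above, the following sketch evaluates the closed-form KL divergence between a diagonal Gaussian and the standard normal, and sums per-attribute adversarial losses. It rests on assumptions that this summary does not state: the encoder outputs a log-variance vector, categorical attributes (e.g. gender) use softmax cross-entropy, and continuous attributes (e.g. age) use squared error. All function names are hypothetical, and the way the terms are weighted and combined is left to the formulas above.

```python
import math


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ) in closed form."""
    return 0.5 * sum(
        math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var)
    )


def cross_entropy(logits, target_index):
    """Softmax cross-entropy for one categorical attribute (assumed loss)."""
    top = max(logits)  # subtract the max for numerical stability
    log_norm = top + math.log(sum(math.exp(v - top) for v in logits))
    return log_norm - logits[target_index]


def squared_error(prediction, target):
    """Squared error for one continuous attribute (assumed loss)."""
    return (prediction - target) ** 2


def adv_x_loss(head_outputs, targets, kinds):
    """Sum of per-attribute adversary losses, one term per adversarial head."""
    total = 0.0
    for output, target, kind in zip(head_outputs, targets, kinds):
        if kind == "categorical":
            total += cross_entropy(output, target)
        elif kind == "continuous":
            total += squared_error(output, target)
        else:
            raise ValueError(f"unknown attribute kind: {kind}")
    return total
```

For example, `adv_x_loss([[0.0, 0.0], 0.5], [0, 1.0], ["categorical", "continuous"])` adds the cross-entropy of an uninformed two-class prediction to the squared error of a continuous prediction that is off by 0.5.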
In this way, the paper addresses the problem of multi-attribute removal in recommender systems, with the reported results indicating improved fairness across demographic groups and better privacy protection of the latent embeddings.