Membership Information Leakage in Federated Contrastive Learning

Kongyang Chen, Wenfeng Wang, Zixin Wang, Wangjun Zhang, Zhipeng Li, Yao Huang
2024-03-07
Abstract: Federated Contrastive Learning (FCL) represents a burgeoning approach for learning from decentralized unlabeled data while upholding data privacy. In FCL, participant clients collaborate in learning a global encoder using unlabeled data, which can serve as a versatile feature extractor for diverse downstream tasks. Nonetheless, FCL is susceptible to privacy risks, such as membership information leakage, stemming from its distributed nature, an aspect often overlooked in current solutions. This study delves into the feasibility of executing a membership inference attack on FCL and proposes a robust attack methodology. The attacker's objective is to determine if the data signifies training member data by accessing the model's inference output. Specifically, we concentrate on attackers situated within a client framework, lacking the capability to manipulate server-side aggregation methods or discern the training status of other clients. We introduce two membership inference attacks tailored for FCL: the *passive membership inference attack* and the *active membership inference attack*, contingent on the attacker's involvement in local model training. Experimental findings across diverse datasets validate the effectiveness of our attacks and underscore the inherent privacy risks associated with the federated contrastive learning paradigm.
Cryptography and Security
What problem does this paper attempt to address?
This paper addresses membership information leakage in Federated Contrastive Learning (FCL). Specifically, the authors study how an attacker within the FCL framework can use the model's inference output to infer whether a given data sample belongs to the training set, that is, how to perform a Membership Inference Attack (MIA). Such attacks can leak sensitive information, and the risk deserves particular attention in distributed learning environments.

### Problem Background

1. **Federated Contrastive Learning (FCL)**: FCL is an emerging distributed unsupervised learning method that allows multiple clients to collaboratively train a global encoder on unlabeled data without sharing the original data. This encoder can be used as a feature extractor for various downstream tasks.
2. **Membership Inference Attack (MIA)**: MIA aims to determine whether a given data sample participated in the model training process. This attack can lead to privacy leakage, especially when the training data is sensitive.

### Main Contributions of the Paper

1. **First study of membership information leakage in FCL**: To the authors' knowledge, this is the first work to explore membership information leakage in FCL.
2. **Two membership inference attack methods**:
   - **Passive Membership Inference Attack**: The attacker neither participates in nor interferes with training, but obtains the trained model parameters and launches the attack from them.
   - **Active Membership Inference Attack**: The attacker participates in local model training, performs gradient ascent on the data to be inferred, and uploads the result to the server to observe the change in loss.
3. **Experimental verification**: The authors evaluate the attacks on multiple datasets, including SVHN, CIFAR-10, and CIFAR-100.
   The experimental results show that these attacks can successfully infer membership information.

### Specific Methods of Passive Membership Inference Attack

1. **Cosine-similarity-based attack**: Compute the cosine similarity between each sample and all other samples, and use these similarity scores to train a binary classifier that distinguishes member from non-member data.
2. **Internal-model-based attack**: Use the encoder to predict labels (although no real labels exist), select the K data points with the highest predicted probability as input, and train a binary classifier.
3. **Feature-combination-based attack**: Combine the cosine similarity, the loss value, and the highest predicted probability into a three-dimensional feature vector, and feed it to a classifier for member/non-member classification.

### Specific Methods of Active Membership Inference Attack

1. **Gradient-ascent attack**: After FCL training completes, obtain the model parameters, perform gradient ascent on the data to be inferred, and observe the change in that data's loss.
2. **Attack during the training process**: Assume the attacker is one of the clients. It adjusts its own local model by performing gradient ascent on the inference data, uploads the modified model parameters to the server, and then observes how the aggregated model's loss or cosine similarity on the inference data changes.

### Summary

This paper reveals the risk of membership information leakage in FCL and proposes two effective membership inference attack methods. These findings provide a reference for improving the security and privacy protection of FCL.
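
### Illustrative Sketch: Feature-Combination-Based Attack

The feature-combination-based attack described above can be illustrated with a small sketch. This is not the authors' implementation: the three per-sample signals (cosine similarity, loss value, highest predicted probability) are assumed to have been computed from the encoder already, the toy data is synthetic, and the function names `build_features`, `train_attack_classifier`, and `predict_member` are hypothetical. A plain logistic regression stands in for the binary classifier, whose actual architecture the summary does not specify.

```python
import numpy as np

def build_features(cos_sim, loss, top_prob):
    """Stack the three per-sample signals into an (n, 3) feature matrix."""
    return np.column_stack([cos_sim, loss, top_prob]).astype(float)

def train_attack_classifier(X, y, lr=0.5, epochs=500):
    """Fit a logistic-regression member/non-member classifier (assumed model)."""
    mu, sigma = X.mean(axis=0), X.std(axis=0) + 1e-8
    Xs = (X - mu) / sigma                      # standardize each feature
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(Xs @ w + b)))  # predicted membership probability
        grad = p - y                             # gradient of the log-loss w.r.t. the logits
        w -= lr * (Xs.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b, mu, sigma

def predict_member(X, params):
    """Return 1 for predicted members and 0 for predicted non-members."""
    w, b, mu, sigma = params
    logits = ((X - mu) / sigma) @ w + b
    return (logits > 0).astype(int)

# Synthetic toy data (assumption): members show higher similarity,
# lower loss, and higher top predicted probability than non-members.
rng = np.random.default_rng(0)
n = 200
members = build_features(rng.normal(0.9, 0.05, n),
                         rng.normal(0.5, 0.2, n),
                         rng.normal(0.8, 0.1, n))
non_members = build_features(rng.normal(0.7, 0.05, n),
                             rng.normal(1.5, 0.2, n),
                             rng.normal(0.5, 0.1, n))
X = np.vstack([members, non_members])
y = np.concatenate([np.ones(n), np.zeros(n)])

params = train_attack_classifier(X, y)
accuracy = (predict_member(X, params) == y).mean()
print(f"attack accuracy on toy data: {accuracy:.2f}")
```

The toy distributions are deliberately well separated, so the accuracy printed here says nothing about real attack performance. In practice the attacker would need a set of samples with known membership status to train the classifier, and would evaluate it on held-out data.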