FedQUIT: On-Device Federated Unlearning via a Quasi-Competent Virtual Teacher

Alessio Mora, Lorenzo Valerio, Paolo Bellavista, Andrea Passarella
2024-08-14
Abstract: Federated Learning (FL) promises better privacy guarantees for individuals' data when machine learning models are collaboratively trained. When an FL participant exercises its right to be forgotten, i.e., to detach from the FL framework it has participated in and to remove its past contributions to the global model, the FL solution should perform all the necessary steps to make this possible without sacrificing the overall performance of the global model, which current state-of-the-art solutions do not support. In this paper, we propose FedQUIT, a novel algorithm that uses knowledge distillation to scrub the contribution of the forgetting data from an FL global model while preserving its generalization ability. FedQUIT works directly on clients' devices and does not require sharing additional information compared with a regular FL process, nor does it assume the availability of public proxy data. Our solution is efficient, effective, and applicable in both centralized and federated settings. Our experimental results show that, on average, FedQUIT requires less than 2.5% additional communication rounds to recover generalization performance after unlearning, obtaining a sanitized global model whose predictions are comparable to those of a global model that has never seen the data to be forgotten.
Machine Learning; Distributed, Parallel, and Cluster Computing
What problem does this paper attempt to address?
This paper addresses the need for data forgetting (the "Right to be Forgotten") in Federated Learning (FL). Specifically, when a federated learning participant wishes to withdraw from the federated learning framework and delete their past contributions to the global model, existing federated learning solutions cannot achieve this without sacrificing the overall performance of the global model. The paper proposes a new algorithm, FedQUIT, which uses knowledge distillation (KD) to erase the contribution of specific data from the global model while maintaining the model's generalization ability.

### Main Issues:
1. **Implementation of Data Forgetting**: How to effectively implement data forgetting in a federated learning environment, i.e., how to completely remove a user's contribution from the global model after they request that their data be deleted.
2. **Performance Maintenance**: How to ensure that the overall performance of the global model, especially its generalization ability, is not degraded by data forgetting.
3. **Privacy Protection**: How to ensure that the private data of other participants is not leaked during the process, in particular by achieving data forgetting without sharing additional information.

### Background and Motivation:
- **GDPR Regulation**: The European Union's General Data Protection Regulation (GDPR) grants individuals control over their personal data, including the "Right to be Forgotten," which allows individuals to request the deletion of their data.
- **Advantages of Federated Learning**: Federated learning trains models on local devices, avoiding the transmission of sensitive data and thus providing better privacy protection.
- **Limitations of Existing Methods**: Existing data forgetting methods usually require retraining the model or relying on additional datasets. This is not feasible in a federated learning environment, because clients may be offline for long periods and datasets may no longer be available.

### Solution:
- **FedQUIT Algorithm**: The algorithm operates directly on client devices using knowledge distillation, without the need for additional datasets or historical update records. Specifically, FedQUIT uses a specially crafted version of the global model as a teacher model to guide the local model (the student model) in data forgetting.
- **Two Specific Methods**:
  - **FedQUIT-Logits**: Achieves data forgetting by modifying the logits output of the global model.
  - **FedQUIT-Softmax**: Achieves data forgetting by modifying the softmax output of the global model.

### Experimental Results:
- **Centralized Environment**: Experiments on the CIFAR-Super20 dataset show that FedQUIT can quickly restore the model's generalization performance after data forgetting, with performance comparable to a retrained model.
- **Federated Environment**: Experiments on the CIFAR-10 and CIFAR-100 datasets show that FedQUIT can effectively achieve data forgetting in a federated learning environment, requiring less than 2.5% additional communication rounds to restore the model's performance.

### Conclusion:
FedQUIT provides an efficient and effective method for data forgetting in federated learning. It meets users' "Right to be Forgotten" requirements without sacrificing model performance, while remaining consistent with the privacy-preserving design of federated learning.
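
### Illustrative Sketch:
The Solution section says that both FedQUIT variants build a teacher signal by modifying the global model's output on the data to be forgotten, and then distill it into the local student model. The minimal Python sketch below shows what such a "virtual teacher" could look like. The summary does not state the exact modification rules, so the two used here are assumptions chosen for illustration and are not confirmed as the authors' procedure: replacing the true-class logit with the smallest logit, and zeroing the true-class probability then renormalizing. All function names are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def virtual_teacher_from_logits(teacher_logits, true_class):
    """Logit-level variant (ASSUMED rule): overwrite the true-class logit
    with the minimum logit, then apply softmax."""
    modified = list(teacher_logits)
    modified[true_class] = min(teacher_logits)
    return softmax(modified)


def virtual_teacher_from_softmax(teacher_logits, true_class):
    """Softmax-level variant (ASSUMED rule): set the true-class probability
    to zero and renormalize the rest. Assumes at least two classes."""
    probs = softmax(teacher_logits)
    probs[true_class] = 0.0
    total = sum(probs)
    if total == 0.0:
        # Degenerate case: spread the mass uniformly over the other classes.
        n_other = len(probs) - 1
        return [0.0 if i == true_class else 1.0 / n_other
                for i in range(len(probs))]
    return [p / total for p in probs]


def kd_loss(student_logits, target_probs, eps=1e-12):
    """KL(target || student), the usual distillation objective."""
    student_probs = softmax(student_logits)
    return sum(
        t * math.log(t / max(s, eps))
        for t, s in zip(target_probs, student_probs)
        if t > 0.0
    )


if __name__ == "__main__":
    # Global-model logits for one forgetting sample whose true class is 0.
    teacher_logits = [4.0, 1.0, 0.5]
    print(virtual_teacher_from_logits(teacher_logits, true_class=0))
    print(virtual_teacher_from_softmax(teacher_logits, true_class=0))
    # A student that still mimics the original teacher incurs a positive loss.
    target = virtual_teacher_from_logits(teacher_logits, true_class=0)
    print(kd_loss(teacher_logits, target))
```

In both variants the target keeps the teacher's relative ranking of the non-true classes, which is one plausible reading of how forgetting can be combined with preserved generalization. Minimizing `kd_loss` on the forgetting data would push the student away from confident true-class predictions while staying close to the teacher elsewhere.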