Promoting Data and Model Privacy in Federated Learning through Quantized LoRA

JianHao Zhu, Changze Lv, Xiaohua Wang, Muling Wu, Wenhao Liu, Tianlong Li, Zixuan Ling, Cenyuan Zhang, Xiaoqing Zheng, Xuanjing Huang
2024-06-16
Abstract: Conventional federated learning primarily aims to secure the privacy of data distributed across multiple edge devices, with the global model dispatched to edge devices for parameter updates during the learning process. However, the development of large language models (LLMs) requires substantial data and computational resources, making them valuable intellectual property for their developers and owners. To establish a mechanism that protects both data and model privacy in a federated learning context, we introduce a method that only needs to distribute a quantized version of the model's parameters during training. This method enables accurate gradient estimation for parameter updates while preventing clients from accessing a model whose performance is comparable to the centrally hosted one. Moreover, we combine this quantization strategy with LoRA, a popular parameter-efficient fine-tuning method, to significantly reduce communication costs in federated learning. The proposed framework, named FedLPP, successfully ensures both data and model privacy in the federated learning context. Additionally, the learned central model exhibits good generalization and can be trained in a resource-efficient manner.
Machine Learning, Computation and Language, Cryptography and Security
What problem does this paper attempt to address?
### Problems Addressed by the Paper

The paper focuses on protecting both data privacy and model privacy in Federated Learning (FL). Specifically:

1. **Data Privacy Protection**: Traditional federated learning trains models without aggregating raw data on a central server, thereby protecting the privacy of data distributed across clients.
2. **Model Privacy Protection**: With the development of large language models (LLMs), the models themselves have become important intellectual property that needs protection from unauthorized access. For example, when LLMs are commercialized as paid services, unauthorized access can harm the interests of developers and owners.

The paper proposes a new framework called FedLPP (FL with LLM Privacy Protection), which protects data and models simultaneously through quantization and Parameter-Efficient Fine-Tuning (PEFT). The specific approach is as follows:

- In each communication round, the central server sends only a quantized version of the global LoRA parameters to the clients, rather than the complete model parameters.
- Combining quantization with LoRA both reduces communication costs and ensures that clients cannot obtain a model whose performance is comparable to the one hosted on the central server.
- A secure aggregation algorithm is used to protect data privacy.

Experimental results show that FedLPP maintains good performance while protecting both data and model privacy, and that it is applicable to federated learning scenarios of different scales.
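
The idea of dispatching only quantized LoRA parameters can be illustrated with a minimal sketch. The text above does not specify the quantizer, so the symmetric uniform scheme below is an assumption, as are the function names (`quantize_uniform`, `dequantize`, `lora_param_count`), the 4-bit width, and the matrix shapes. It is not the paper's implementation; it only shows that a client receiving low-bit codes reconstructs an approximation of the server's LoRA matrices, and how small the LoRA payload is next to a full weight matrix.

```python
import numpy as np


def quantize_uniform(w, num_bits=4):
    """Symmetric uniform quantization (an assumed scheme, not the paper's).

    Returns integer codes in [-levels, levels] and the float scale.
    """
    levels = 2 ** (num_bits - 1) - 1          # 7 for 4 bits
    max_abs = float(np.max(np.abs(w)))
    scale = max_abs / levels if max_abs > 0 else 1.0
    codes = np.clip(np.round(w / scale), -levels, levels).astype(np.int8)
    return codes, scale


def dequantize(codes, scale):
    """Low-precision reconstruction that a client would work with."""
    return codes.astype(np.float32) * scale


def lora_param_count(d_out, d_in, rank):
    """Number of entries in the LoRA pair B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


# Hypothetical shapes for one adapted weight matrix.
rng = np.random.default_rng(0)
d_out, d_in, rank = 64, 64, 4
A = rng.normal(scale=0.02, size=(rank, d_in)).astype(np.float32)
B = rng.normal(scale=0.02, size=(d_out, rank)).astype(np.float32)

# Server side: quantize the global LoRA matrices before dispatch.
A_codes, A_scale = quantize_uniform(A)
B_codes, B_scale = quantize_uniform(B)

# Client side: only the codes and scales arrive, so the client sees
# an approximation of A and B, never the full-precision values.
A_client = dequantize(A_codes, A_scale)
B_client = dequantize(B_codes, B_scale)

max_err_A = float(np.max(np.abs(A - A_client)))
lora_params = lora_param_count(d_out, d_in, rank)
full_params = d_out * d_in

print(f"max reconstruction error on A: {max_err_A:.6f}")
print(f"LoRA entries: {lora_params} vs full matrix entries: {full_params}")
```

In this sketch the per-entry reconstruction error is bounded by half the quantization step, and the LoRA pair has `rank * (d_out + d_in)` entries against `d_out * d_in` for the full matrix, which is where the communication saving comes from.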