FedML-HE: An Efficient Homomorphic-Encryption-Based Privacy-Preserving Federated Learning System

Weizhao Jin, Yuhang Yao, Shanshan Han, Jiajun Gu, Carlee Joe-Wong, Srivatsan Ravi, Salman Avestimehr, Chaoyang He
2024-06-17
Abstract: Federated Learning trains machine learning models on distributed devices by aggregating local model updates instead of local data. However, privacy concerns arise as the aggregated local models on the server may reveal sensitive personal information by inversion attacks. Privacy-preserving methods, such as homomorphic encryption (HE), then become necessary for FL training. Despite HE's privacy advantages, its applications suffer from impractical overheads, especially for foundation models. In this paper, we present FedML-HE, the first practical federated learning system with efficient HE-based secure model aggregation. FedML-HE proposes to selectively encrypt sensitive parameters, significantly reducing both computation and communication overheads during training while providing customizable privacy preservation. Our optimized system demonstrates considerable overhead reduction, particularly for large foundation models (e.g., ~10x reduction for ResNet-50, and up to ~40x reduction for BERT), demonstrating the potential for scalable HE-based FL deployment.
Machine Learning, Cryptography and Security
What problem does this paper attempt to address?
The paper addresses privacy leakage in Federated Learning (FL), specifically the risk that local models aggregated on the server reveal sensitive personal information. To tackle this, it proposes FedML-HE, an efficient privacy-preserving federated learning system based on Homomorphic Encryption (HE).

In standard federated learning, local data is never shared directly with the central server, yet local models or model updates remain vulnerable to data reconstruction and gradient inversion attacks, which can leak users' sensitive information. Existing defenses such as Differential Privacy (DP) and Secure Aggregation mitigate these risks to some extent, but they either degrade model performance or require additional interactive synchronization steps and are sensitive to client dropouts.

The key contributions of the paper are as follows:

1. **FedML-HE System**: A practical HE-based privacy-preserving federated learning system that supports encryption key management, encrypted federated learning platform deployment, and encryption optimizations to reduce overhead. It is designed specifically for efficient federated training of foundation models.
2. **Selective Parameter Encryption**: A method that encrypts only the most sensitive parameters, minimizing the size of encrypted model updates while providing customizable privacy protection.
3. **Theoretical Privacy Analysis**: The paper shows that the proposed HE system ensures privacy under single-key and threshold adversaries, and that encrypting the most sensitive parameters provides orders-of-magnitude better privacy guarantees.
4. **Experimental Results**: Extensive experiments show that the optimized system significantly reduces overhead while still defending against state-of-the-art machine learning privacy attacks. The gains are largest for HE federated training of large models, with overhead reductions of approximately 10x for ResNet-50 and up to 40x for BERT.

Overall, FedML-HE aims to strengthen the security and privacy of federated learning through efficient homomorphic encryption, making HE-based FL practical for real-world deployment.
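To make the selective parameter encryption idea (contribution 2) more concrete, here is a minimal Python sketch of one aggregation round. It is an illustration under stated assumptions, not the FedML-HE implementation: a toy Paillier scheme with insecure demo-sized primes stands in for whatever HE scheme the system actually uses, the per-parameter sensitivity scores are invented inputs, and every function name (`select_mask`, `client_upload`, `server_aggregate`, `client_recover`) is hypothetical.

```python
import math
import random

# Toy Paillier keypair (additively homomorphic). Demo-sized primes, NOT secure.
# Assumption: Paillier is a stand-in; the summary does not name the HE scheme.
P, Q = 2**31 - 1, 2**61 - 1                          # two Mersenne primes
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)    # lcm(p - 1, q - 1)
MU = pow(LAM, -1, N)                                 # valid for generator g = N + 1 (Python 3.8+)
SCALE = 10**6                                        # fixed-point scale for float parameters


def encrypt(x: float) -> int:
    """Encrypt one parameter after fixed-point encoding."""
    m = round(x * SCALE) % N
    r = random.randrange(1, N)
    while math.gcd(r, N) != 1:
        r = random.randrange(1, N)
    return (pow(N + 1, m, N2) * pow(r, N, N2)) % N2


def decrypt(c: int) -> float:
    """Decrypt and map the result back to a signed float."""
    m = ((pow(c, LAM, N2) - 1) // N * MU) % N
    if m > N // 2:                                   # upper half encodes negatives
        m -= N
    return m / SCALE


def select_mask(sensitivity, ratio):
    """Pick the indices of the top `ratio` fraction of parameters by sensitivity."""
    k = max(1, int(len(sensitivity) * ratio))
    ranked = sorted(range(len(sensitivity)), key=lambda i: sensitivity[i], reverse=True)
    return set(ranked[:k])


def client_upload(update, mask):
    """Encrypt only the masked parameters; send the rest in plaintext."""
    enc = {i: encrypt(v) for i, v in enumerate(update) if i in mask}
    plain = {i: v for i, v in enumerate(update) if i not in mask}
    return enc, plain


def server_aggregate(uploads):
    """Sum updates without decrypting: multiply ciphertexts, add plaintexts."""
    enc_sum, plain_sum = {}, {}
    for enc, plain in uploads:
        for i, c in enc.items():
            enc_sum[i] = (enc_sum.get(i, 1) * c) % N2   # ciphertext product = plaintext sum
        for i, v in plain.items():
            plain_sum[i] = plain_sum.get(i, 0.0) + v
    return enc_sum, plain_sum


def client_recover(enc_sum, plain_sum, num_clients):
    """Decrypt the encrypted sums, merge with plaintext sums, and average."""
    total = dict(plain_sum)
    total.update({i: decrypt(c) for i, c in enc_sum.items()})
    return [total[i] / num_clients for i in sorted(total)]


# Hypothetical data: three clients, four parameters, encrypt the top 50%.
updates = [[0.5, -1.25, 0.1, 2.0], [1.5, 0.25, -0.3, 1.0], [-0.5, 1.0, 0.2, 0.0]]
sensitivity = [0.9, 0.1, 0.7, 0.2]
mask = select_mask(sensitivity, ratio=0.5)
uploads = [client_upload(u, mask) for u in updates]
enc_sum, plain_sum = server_aggregate(uploads)
avg = client_recover(enc_sum, plain_sum, num_clients=len(updates))
print(avg)
```

The `ratio` argument is the knob behind the "customizable privacy protection" trade-off: a larger ratio sends more ciphertexts, which costs more computation and bandwidth, while a smaller ratio leaves more parameters in plaintext. The sketch also assumes all clients share one mask and one key pair that the server does not hold; how keys and masks are actually agreed upon is outside what this summary describes.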