Privacy preserving and secure robust federated learning: A survey

Qingdi Han, Siqi Lu, Wenhao Wang, Haipeng Qu, Jingsheng Li, Yang Gao
DOI: https://doi.org/10.1002/cpe.8084
2024-03-20
Concurrency and Computation: Practice and Experience
Abstract: Federated learning (FL) has emerged as a promising solution to address the challenges posed by data silos and the need for global data fusion. It offers a distributed machine learning framework with privacy‐preserving features, allowing model training without the need to collect user data. However, FL also presents significant security and privacy threats that hinder its widespread adoption. The requirements of privacy and security in FL are inherently conflicting. Privacy necessitates the concealment of individual client updates, while security requires the disclosure of client updates to detect anomalies. While most existing research has focused on the privacy and security aspects of FL, very few studies have addressed the compatibility of these two demands. In this work, we aim to bridge this gap by proposing a comprehensive defense scheme that ensures privacy, security, and compatibility in FL. We categorize the existing literature into two key directions: privacy defense and security defense. Privacy defense includes methods based on additive masks, differential privacy, homomorphic encryption, and trusted execution environments. Security defense encompasses distance‐, performance‐, clustering‐, and similarity‐based anomaly detection techniques and statistical information‐based anomaly update bypassing techniques when the server is trusted, and privacy‐compatible anomaly update detection techniques when the server is not trusted. In addition, this article presents decentralized FL solutions based on blockchain. For each direction, we discuss specific technical solutions and their advantages and disadvantages. By evaluating various defense methods, we identify the most suitable approach to address the primary challenge of "achieving a secure and robust FL system against malicious adversaries while protecting users' privacy."
We then propose a theoretical reference framework for end‐to‐end protection of privacy and security in FL for the key problem, which summarizes the attack surface of FL systems from the client to the server under the security model where the client and server are malicious. Leveraging the strengths and characteristics of existing schemes, our proposed framework integrates multiple techniques to strike a balance between privacy, usability, and efficiency. This framework serves as a valuable reference and provides insights for future work in the field. Finally, we also provide recommendations for future research directions in this field.
computer science, theory & methods, software engineering