A multifaceted survey on privacy preservation of federated learning: progress, challenges, and opportunities

Sanchita Saha, Ashlesha Hota, Arup Kumar Chattopadhyay, Amitava Nag, Sukumar Nandi
DOI: https://doi.org/10.1007/s10462-024-10766-7
IF: 9.588
2024-06-23
Artificial Intelligence Review
Abstract: Federated learning (FL) refers to a system of training and stabilizing local machine learning models at the global level by aggregating the learning gradients of the models. It reduces the concern of sharing the private data of participating entities for statistical analysis to be carried out at the server. It allows participating entities called clients or users to infer useful information from their raw data. As a consequence, the need to share their confidential information with any other entity or the central entity called server is eliminated. FL can be clearly interpreted as a privacy-preserving version of traditional machine learning and deep learning algorithms. However, despite this being an efficient distributed training scheme, the client's sensitive information can still be exposed to various security threats from the shared parameters. Since data has always been a major priority for any user or organization, this article is primarily concerned with discussing the significant problems and issues relevant to the preservation of data privacy and the viability and feasibility of several proposed solutions in the FL context. In this work, we conduct a detailed study on FL, the categorization of FL, the challenges of FL, and various attacks that can be executed to disclose the users' sensitive data used during learning. In this survey, we review and compare different privacy solutions for FL to prevent data leakage and discuss secret sharing (SS)-based security solutions for FL proposed by various researchers in concise form. We also briefly discuss quantum federated learning (QFL) and privacy-preservation techniques in QFL. In addition to these, a comparison and contrast of several survey works on FL is included in this work. We highlight the major applications based on FL. We discuss certain future directions pertaining to the open issues in the field of FL and finally conclude our work.
computer science, artificial intelligence
What problem does this paper attempt to address?
The problem that this paper attempts to solve is how to protect data privacy in Federated Learning (FL). Although Federated Learning reduces the need for direct sharing of private data by training models locally and only sharing model parameters, the sensitive information of participants may still face various security threats due to the shared parameters. Therefore, this paper mainly focuses on discussing important issues and challenges related to data privacy protection and evaluates the feasibility and effectiveness of several proposed solutions. Specifically, the main contributions of the paper include:

1. **Comprehensive overview**: The paper provides a comprehensive overview of Federated Learning, including its definition, framework, advantages, and aggregation methods, and describes its classification, such as horizontal Federated Learning, vertical Federated Learning, and federated transfer learning.
2. **Main challenges**: It studies in detail the main challenges of Federated Learning, including optimizing high communication costs, system heterogeneity, statistical heterogeneity, network bandwidth differences, physical component differences, available memory buffers, fairness issues, and data privacy issues from various network attacks. The paper also summarizes the latest research results and points out the research gaps that still need further research.
3. **Privacy-protection schemes**: It focuses on the applications of privacy-protection technologies such as differential privacy, secure multi-party computation, homomorphic encryption, and secret sharing in Federated Learning.
4. **Quantum Federated Learning**: It briefly outlines Quantum Federated Learning (QFL) and discusses the contributions of researchers in privacy protection.
5. **Blockchain and Federated Learning**: It explores the emerging research field of the combination of blockchain and Federated Learning and provides a privacy-protection framework.
6. **Application scenarios**: It discusses the applications of Federated Learning in multiple fields such as intelligent healthcare, smart cities and homes, self-driving cars, the insurance industry, image classification, recommendation systems, intelligent transportation, and national defense.
7. **Future research directions**: It identifies the open problems in the field of Federated Learning and proposes directions for future research.

Through these contributions, the paper aims to help researchers gain an in-depth understanding of the current situation in the field of Federated Learning, identify areas that need further exploration, and provide practical information to promote the development of various security schemes and protect the privacy of participating entities in the Federated Learning setting.
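To make the idea of secret-sharing-based privacy protection more concrete, the following is a minimal sketch of additive secret sharing used to sum client updates without any single party seeing an individual update. It is an illustration under stated assumptions, not a scheme taken from the paper: the function names (`share`, `decode_signed`, `secure_sum`), the modulus `PRIME`, and the simplification that each update is a single already-quantized integer are all hypothetical choices made for this example.

```python
import secrets

# Assumed field modulus for this sketch; any sufficiently large prime would do.
PRIME = 2**61 - 1


def share(value, n_shares, prime=PRIME):
    """Split an integer into n additive shares that sum to value mod prime."""
    shares = [secrets.randbelow(prime) for _ in range(n_shares - 1)]
    last = (value - sum(shares)) % prime
    return shares + [last]


def decode_signed(value, prime=PRIME):
    """Map a field element back to a signed integer (upper half is negative)."""
    return value - prime if value > prime // 2 else value


def secure_sum(client_values, prime=PRIME):
    """Sum integer client updates so that only the total is revealed."""
    n = len(client_values)
    # Client i splits its value into n shares; share j is sent to client j.
    all_shares = [share(v % prime, n, prime) for v in client_values]
    # Client j adds the shares it received and reveals only this partial sum.
    partial_sums = [
        sum(all_shares[i][j] for i in range(n)) % prime for j in range(n)
    ]
    # The server combines the partial sums into the aggregate.
    return decode_signed(sum(partial_sums) % prime, prime)


if __name__ == "__main__":
    # Hypothetical quantized updates from three clients.
    print(secure_sum([3, -5, 10]))
```

In this sketch each individual share is uniformly random, so a party holding fewer than all shares of a value learns nothing about it; real gradient vectors would additionally need fixed-point encoding and handling of client dropouts, which are omitted here.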