Abstract: With the arrival of the Internet of Things (IoT) era and the rapid development of technologies such as artificial intelligence and big data, the volume of data is growing geometrically. These data tend to form data silos scattered across many locations. Jointly exploiting decentralized data would accelerate progress, but it also raises the challenge of protecting data privacy. Federated learning (FL), a branch of distributed machine learning, trains a global model collaboratively without directly exposing local datasets. Studies have shown that federated learning typically involves a large number of participants, which can significantly increase communication overhead and lead to higher latency and bandwidth consumption. We propose masking a subset of the diverse participants and allowing the remaining participants to proceed with the next communication round of updates. Our aim is to reduce communication overhead and improve the convergence performance of the global model while preserving the privacy of heterogeneous data. We design a private masking approach, PrivMaskFL, to address these two problems. First, we propose a dynamic participant aggregation masking approach, which uses a greedy strategy to select the relatively important participants and mask the unimportant ones. Second, we design an adaptive differential privacy approach, which adaptively stratifies the privacy budget according to the characteristics of each participant, allocates the budget at a fine-grained stratified level, and adds Gaussian noise accordingly.
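To make the stratified privacy budget idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes that participants are ranked by a hypothetical scalar score (for example, local dataset size), that each stratum receives a hand-picked epsilon, and that the noise scale follows the classic Gaussian mechanism with the clipping norm used as the sensitivity. The names `assign_strata`, `privatize`, and `stratum_epsilon`, and all numeric values, are illustrative assumptions.

```python
import math
import numpy as np

def gaussian_sigma(epsilon, delta, sensitivity):
    """Noise scale of the classic Gaussian mechanism (stated for 0 < epsilon < 1)."""
    return math.sqrt(2.0 * math.log(1.25 / delta)) * sensitivity / epsilon

def assign_strata(scores, num_strata):
    """Rank participants by score and split the ranking into equal-sized strata.

    Stratum 0 holds the lowest scores. The score itself is a hypothetical
    stand-in for the participant characteristics mentioned in the abstract.
    """
    order = np.argsort(scores)
    strata = np.empty(len(scores), dtype=int)
    for rank, idx in enumerate(order):
        strata[idx] = rank * num_strata // len(scores)
    return strata

def privatize(update, epsilon, delta, clip_norm, rng):
    """Clip an update to L2 norm `clip_norm`, then add Gaussian noise.

    Assumption: the clipping norm is used as the sensitivity.
    """
    norm = np.linalg.norm(update)
    clipped = update * min(1.0, clip_norm / (norm + 1e-12))
    sigma = gaussian_sigma(epsilon, delta, clip_norm)
    return clipped + rng.normal(0.0, sigma, size=update.shape)

# Illustrative run: four participants, two strata, one epsilon per stratum.
rng = np.random.default_rng(0)
scores = np.array([10, 200, 50, 400])      # hypothetical characteristic
strata = assign_strata(scores, num_strata=2)
stratum_epsilon = [0.3, 0.8]               # assumed budgets, not from the paper
update = np.ones(5)
noisy = [privatize(update, stratum_epsilon[s], 1e-5, 1.0, rng) for s in strata]
```

A smaller epsilon yields a larger noise scale, so participants in stratum 0 receive stronger protection at the cost of noisier updates. Which stratum should receive which budget is a design decision that this sketch leaves open.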
Specifically, in each communication round, each participant adds local differential privacy noise to its model before uplink parameter transmission; the server aggregates the received parameters to obtain the global model and then uses a greedy approximation algorithm to find a candidate participant subset with smaller parameter divergence for the (t+1)-th communication round of downlink parameter transmission. Subsequently, the privacy budget sequence is divided and assigned to the participants at each stratified level, and adaptive differentially private Gaussian noise is added, achieving privacy protection without compromising the model’s usability. In experiments, our approach reduces communication overhead and improves convergence performance. Furthermore, it achieves higher accuracy and robust variance on both the FMNIST and FEMNIST datasets.
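The server-side selection step can be sketched as follows. This is an illustration under stated assumptions rather than the paper's algorithm: parameter divergence is taken to be the L2 distance between a participant's uploaded parameters and the aggregated global model, aggregation is a weighted average, and the greedy rule repeatedly picks the remaining participant with the smallest divergence until `k` are chosen. The function names and the value of `k` are hypothetical.

```python
import numpy as np

def fedavg(updates, weights):
    """Weighted average of participant parameter vectors."""
    weights = np.asarray(weights, dtype=float)
    return np.average(np.stack(updates), axis=0, weights=weights)

def greedy_select(updates, global_model, k):
    """Greedily pick the k participants closest to the global model.

    Returns (selected, masked): selected participants take part in round
    t+1, masked participants are skipped for that round.
    """
    # Assumed divergence measure: L2 distance to the global model.
    divergence = [float(np.linalg.norm(u - global_model)) for u in updates]
    remaining = set(range(len(updates)))
    selected = []
    while remaining and len(selected) < k:
        # Greedy step: take the remaining participant with the least divergence.
        best = min(remaining, key=lambda i: (divergence[i], i))
        selected.append(best)
        remaining.remove(best)
    return selected, sorted(remaining)

# Illustrative round with four participants; participant 2 is an outlier.
updates = [
    np.array([1.0, 1.0]),
    np.array([1.2, 0.8]),
    np.array([5.0, -3.0]),
    np.array([0.9, 1.1]),
]
global_model = fedavg(updates, weights=[1, 1, 1, 1])
selected, masked = greedy_select(updates, global_model, k=3)
```

In this toy run the outlying participant is masked and the other three proceed to the next round, which reduces the number of downlink and uplink transmissions by one.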

Masking-enabled Data Protection Approach for Accurate Split Learning

Privacy-Preserving Collaborative Deep Learning with Unreliable Participants

A Distributed Privacy-Preserving Framework for Deep Learning with Edge-Cloud Computing

Make Split, not Hijack: Preventing Feature-Space Hijacking Attacks in Split Learning

Secure Split Learning Against Property Inference, Data Reconstruction, and Feature Space Hijacking Attacks

Protecting Split Learning by Potential Energy Loss

Evaluating Privacy Leakage in Split Learning

Enhancing Accuracy-Privacy Trade-off in Differentially Private Split Learning

Split Ways: Privacy-Preserving Training of Encrypted Data Using Split Learning

SplitGuard: Detecting and Mitigating Training-Hijacking Attacks in Split Learning

Decentralized Proactive Model Offloading and Resource Allocation for Split and Federated Learning

Functionality and Data Stealing by Pseudo-Client Attack and Target Defenses in Split Learning

Dullahan: Stealthy Backdoor Attack against Without-Label-Sharing Split Learning

Privacy-Preserving Split Learning with Vision Transformers using Patch-Wise Random and Noisy CutMix

A Stealthy Backdoor Attack for Without-Label-Sharing Split Learning

Split Learning without Local Weight Sharing to Enhance Client-side Data Privacy

How to backdoor split learning

PrivMaskFL: A private masking approach for heterogeneous federated learning in IoT

Love or Hate? Share or Split? Privacy-Preserving Training Using Split Learning and Homomorphic Encryption

Split Learning for Distributed Collaborative Training of Deep Learning Models in Health Informatics

Insuring against the perils in distributed learning: privacy-preserving empirical risk minimization