What problem does this paper attempt to address?

The paper addresses the following problem: in distributed deep learning, how can models be trained or used for inference while protecting the privacy and confidentiality of data, without direct access to the clients' original data? Specifically, it explores how multiple entities (such as hospitals and financial institutions) can collaborate to train deep neural networks without centrally storing data or sharing it with one another.

### Problem Background

With the development of emerging technologies in fields such as biomedicine, health, and finance, distributed deep-learning methods are becoming increasingly important. These methods allow multiple entities to jointly train a deep neural network without sharing data or aggregating resources. However, because of trust and regulatory issues (such as HIPAA), certain sensitive data (such as medical data) cannot be shared directly. New techniques are therefore needed so that models trained in a distributed environment protect data privacy while maintaining model performance.

### Main Challenges

1. **Data Privacy Protection**: Ensure that during training, neither the server nor any client can directly access or infer the original data of other parties.
2. **Model Architecture and Parameter Protection**: In addition to protecting data, the architecture and parameters of the model must also be protected to prevent leakage of sensitive information.
3. **Computing Resources and Communication Efficiency**: While protecting privacy, keep the demand for computing resources (memory, time, communication bandwidth, and so on) low enough for the method to be practical.

### Solutions

The paper mainly discusses several distributed deep-learning methods and their combinations:

- **Federated Learning**: Each client updates the model parameters locally and sends the updated parameters to the server, which then aggregates them. This method does not share the original data, but may leak the model architecture and parameters.
- **Large Batch Synchronous SGD**: Introducing backup worker nodes gives faster synchronous updates, but the computational and communication overheads are large.
- **SplitNN**: The client trains the network only up to a certain layer (the cut layer) and sends the intermediate representation to the server, which continues training the remaining layers. This method protects not only the original data but also the model architecture and parameters.

In addition, the paper explores combining these methods with techniques such as differential privacy, homomorphic encryption, oblivious transfer, and garbled circuits to further strengthen privacy protection.

### Conclusions and Future Work

The paper points out that with a large number of clients, SplitNN performs best in terms of communication bandwidth and computing resources, while federated learning performs better when the number of clients is small. Future research directions include:

- Improving the resource and communication efficiency of the no-peek methods.
- Combining neural network compression methods to further reduce network bandwidth requirements.
- Studying robustness against data-poisoning attacks to ensure the security of the model.
- Applying these methods to distributed healthcare, clinical trials, cross-organizational collaboration, and finance.

In summary, the paper explores how to achieve efficient and secure data protection in distributed deep learning, especially in application scenarios involving sensitive data.
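The federated learning and SplitNN update patterns summarized under Solutions can be illustrated with a minimal NumPy sketch. It is not the paper's implementation. Several details are assumptions made for illustration: the server aggregates by a dataset-size-weighted mean (`fed_avg`), the split network is a toy two-layer regressor with a `tanh` cut layer, and the server also holds the labels. All names (`fed_avg`, `SplitClient`, `SplitServer`, `train_split`) are hypothetical.

```python
import numpy as np


def fed_avg(client_weights, client_sizes):
    """Server-side aggregation: size-weighted mean of client parameters.

    Weighting by local dataset size is an assumption of this sketch.
    """
    sizes = np.asarray(client_sizes, dtype=float)
    stacked = np.stack(client_weights)           # shape: (n_clients, ...)
    return np.tensordot(sizes / sizes.sum(), stacked, axes=1)


class SplitClient:
    """Holds the layers up to the cut layer; raw inputs never leave it."""

    def __init__(self, n_in, n_cut, rng):
        self.W = rng.normal(scale=0.1, size=(n_in, n_cut))

    def forward(self, x):
        self.x = x
        self.h = np.tanh(x @ self.W)             # cut-layer activations
        return self.h                            # the only tensor sent to the server

    def backward(self, grad_h, lr):
        grad_pre = grad_h * (1.0 - self.h ** 2)  # backprop through tanh
        self.W -= lr * self.x.T @ grad_pre


class SplitServer:
    """Holds the remaining layers; in this sketch it also holds the labels."""

    def __init__(self, n_cut, rng):
        self.V = rng.normal(scale=0.1, size=(n_cut, 1))

    def step(self, h, y, lr):
        err = h @ self.V - y
        loss = float(np.mean(err ** 2))          # mean squared error
        grad_pred = 2.0 * err / len(y)
        grad_h = grad_pred @ self.V.T            # gradient returned to the client
        self.V -= lr * h.T @ grad_pred
        return loss, grad_h


def train_split(steps=300, lr=0.05, seed=0):
    """Run full-batch split training on toy data and return the loss history."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(64, 4))                 # private client data (toy)
    y = 0.5 * x[:, :1]                           # toy regression target
    client, server = SplitClient(4, 3, rng), SplitServer(3, rng)
    losses = []
    for _ in range(steps):
        h = client.forward(x)                    # client -> server: activations
        loss, grad_h = server.step(h, y, lr)     # server -> client: gradients
        client.backward(grad_h, lr)
        losses.append(loss)
    return losses
```

In the split loop, only the cut-layer activations `h` and the gradient `grad_h` cross the client/server boundary; the raw inputs `x` and the client weights `W` stay on the client. In `fed_avg`, by contrast, full parameter arrays are sent to the server, which is consistent with the summary's remark that federated learning may expose model parameters.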

No Peek: A Survey of private distributed deep learning

Privacy-Preserving Collaborative Deep Learning with Unreliable Participants.

GELU-Net: A Globally Encrypted, Locally Unencrypted Deep Neural Network for Privacy-Preserved Learning.

A Distributed Privacy-Preserving Framework for Deep Learning with Edge-Cloud Computing.

Industrial Scale Privacy Preserving Deep Neural Network

NoPeek: Information leakage reduction to share activations in distributed deep learning

Secure Distributed Training at Scale

SPEED: Secure, PrivatE, and Efficient Deep learning

A(DP)$^2$SGD: Asynchronous Decentralized Parallel Stochastic Gradient Descent with Differential Privacy

Towards Efficient and Privacy-Preserving Federated Deep Learning

Private Knowledge Sharing in Distributed Learning: A Survey

From distributed machine learning to federated learning: In the view of data privacy and security

PrivyNet: A Flexible Framework for Privacy-Preserving Deep Neural Network Training

Big Data Intelligence Using Distributed Deep Neural Networks

Protection Against Reconstruction and Its Applications in Private Federated Learning

Security for Distributed Deep Neural Networks Towards Data Confidentiality & Intellectual Property Protection

How to Democratise and Protect AI: Fair and Differentially Private Decentralised Deep Learning

Decentralized Deep Learning for Multi-Access Edge Computing: A Survey on Communication Efficiency and Trustworthiness

Privacy preserving distributed training of neural networks