Abstract: Advances in Natural Language Processing (NLP) have been significantly driven by the adoption of comprehensive pretrained language models (PLMs) such as BERT and RoBERTa. Nevertheless, increasing concerns regarding data privacy and the enforcement of stringent data protection regulations, such as PDPA and GDPR, have highlighted the limitations of traditional centralized machine learning methods. Federated Learning (FL) emerges as a promising solution that mitigates privacy concerns by training models on client devices and aggregating the parameters on a central server, thus avoiding direct data transfer. Despite its considerable potential, FL's application in NLP faces numerous challenges. These include the need to replicate architectural designs across different devices, the management of non-Independent and Identically Distributed (non-IID) data, and the significant communication overhead caused by frequent transfers of model parameters. Federated Distillation (FD) has been proposed to overcome these challenges by facilitating information transfer through unlabeled public proxy datasets. This approach helps to reduce communication costs and promotes collaboration among different model architectures. However, FD carries potential privacy risks and may result in substantial loss of previously acquired knowledge, thus diminishing the model's effectiveness. To address these issues, we introduce the Privacy-preserving Federated Distillation Method for Pretraining Language Models (PFDP). Our approach distinguishes itself from conventional methods by injecting noise into only a select portion of a predetermined dataset, thereby minimizing the noise's impact on the model's utility. In addition, PFDP uses transfer learning to improve the generalization ability of the global model and reduce the impact of catastrophic forgetting. Extensive evaluation across multiple classification tasks demonstrates the efficacy of PFDP in improving accuracy while safeguarding privacy.

PPKD: Privacy-preserving Knowledge Distillation for Large Model

PKDGAN: Private Knowledge Distillation with Generative Adversarial Networks

Private Knowledge Transfer via Model Distillation with Generative Adversarial Networks

Privacy-Preserving Collaborative Deep Learning with Unreliable Participants

Safe Distillation Box

Learning Privacy-Preserving Student Networks via Discriminative-Generative Distillation

Privacy-Preserving Student Learning with Differentially Private Data-Free Distillation

Locally Differentially Private Distributed Deep Learning via Knowledge Distillation

Swing Distillation: A Privacy-Preserving Knowledge Distillation Framework

A Privacy Knowledge Transfer Method for Clinical Concept Extraction

Fine-grained Private Knowledge Distillation

Resisting membership inference attacks through knowledge distillation

Model Conversion via Differentially Private Data-Free Distillation

Personalized and privacy-enhanced federated learning framework via knowledge distillation

Anti-Distillation Backdoor Attacks: Backdoors Can Really Survive in Knowledge Distillation

DDK: Distilling Domain Knowledge for Efficient Large Language Models

Pre-training Distillation for Large Language Models: A Design Space Exploration

Differentially Private Knowledge Distillation for Mobile Analytics

PFDP: Privacy-preserving Federated Distillation Method for Pretraining Language Models

Undistillable: Making A Nasty Teacher That CANNOT teach students