Abstract: Private data, being larger and of higher quality than public data, can greatly improve large language models (LLMs). However, due to privacy concerns, this data is often dispersed across multiple silos, making its secure utilization for LLM training a challenge. Federated learning (FL) is an ideal solution for training models with distributed private data, but traditional frameworks like FedAvg are unsuitable for LLMs due to their high computational demands on clients. An alternative, split learning, offloads most training parameters to the server while training the embedding and output layers locally, making it more suitable for LLMs. Nonetheless, it faces significant challenges in security and efficiency. Firstly, the gradients of embeddings are prone to attacks, leading to potential reverse engineering of private data. Furthermore, the server's limitation of handling only one client's training request at a time hinders parallel training, severely impacting training efficiency. In this paper, we propose a federated learning framework for LLMs, named FL-GLM, which prevents data leakage caused by both server-side and peer-client attacks while improving training efficiency. Specifically, we first place the input block and output block on the local client to prevent embedding gradient attacks from the server. Secondly, we employ key-encryption during client-server communication to prevent reverse engineering attacks from peer clients. Lastly, we employ optimization methods like client-batching or server-hierarchical, adopting different acceleration methods based on the actual computational capabilities of the server. Experimental results on NLU and generation tasks demonstrate that FL-GLM achieves metrics comparable to the centralized ChatGLM model, validating the effectiveness of our federated learning framework.
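The placement described in the abstract (input and output blocks on the client, the remaining parameters on the server) can be illustrated with a toy forward pass. This is a minimal sketch under stated assumptions, not the paper's implementation: the `Client`, `Server`, and `forward` names are hypothetical, single linear maps stand in for transformer blocks, and training, gradients, and the key-encryption step are omitted.

```python
"""Toy sketch of a three-way model split (hypothetical names throughout).

Input and output blocks stay on the client; only hidden states cross the
client-server boundary. Linear maps stand in for real transformer blocks.
"""
import random

HIDDEN = 4   # assumed toy hidden size
VOCAB = 10   # assumed toy vocabulary size


def make_matrix(rows, cols, rng):
    """Return a rows x cols matrix of small random weights."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vec):
    """Multiply a rows x cols matrix by a length-cols vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


class Client:
    def __init__(self, rng):
        self.embedding = make_matrix(VOCAB, HIDDEN, rng)  # input block
        self.head = make_matrix(VOCAB, HIDDEN, rng)       # output block

    def input_block(self, token_ids):
        # Raw token ids are consumed here and never leave the client.
        return [list(self.embedding[t]) for t in token_ids]

    def output_block(self, hidden_states):
        # Logits (and, in training, labels and loss) stay on the client.
        return [matvec(self.head, h) for h in hidden_states]


class Server:
    def __init__(self, rng):
        self.weights = make_matrix(HIDDEN, HIDDEN, rng)   # stand-in for middle blocks

    def middle_blocks(self, hidden_states):
        # The server sees hidden states only: ReLU(W h) per position.
        return [[max(0.0, v) for v in matvec(self.weights, h)]
                for h in hidden_states]


def forward(client, server, token_ids):
    sent = client.input_block(token_ids)      # client -> server payload
    returned = server.middle_blocks(sent)     # server -> client payload
    logits = client.output_block(returned)
    return sent, returned, logits
```

In this sketch the server never receives token ids or the embedding table, which is the property the abstract relies on to block server-side embedding gradient attacks.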

What problem does this paper attempt to address?

The main problem this paper addresses is how to train large language models (LLMs) on private data scattered across multiple data silos while protecting user privacy. Specifically, the paper focuses on:

1. **Privacy protection**: Because of privacy concerns, private data is usually held on separate devices or by separate institutions and is difficult to pool centrally. Traditional federated learning frameworks such as FedAvg protect data privacy to some extent, but they are unsuitable for training LLMs because they demand substantial computing resources on the client side.
2. **Computing efficiency**: Existing split learning methods (such as FedBERT) reduce the client-side computing burden, but they remain exposed to security problems (such as embedding gradient attacks) and suffer from low training efficiency.

To meet these challenges, the paper proposes a new federated learning framework, FL-GLM, which aims at:

- **Preventing data leakage**: Placing the input and output blocks on local clients prevents embedding gradient attacks from the server side. In addition, key encryption is applied to client-server communication to prevent reverse-engineering attacks from other clients.
- **Improving training efficiency**: The client-batching and server-hierarchical methods improve the parallelism and efficiency of training.

### Specific contributions of the paper

1. **Designed a federated learning framework specifically for large language models**: Starting from user privacy protection and computing resource requirements, the paper improves on split learning to develop a practical, effective, and secure federated learning framework.
2. **Proposed the client-batching and server-hierarchical optimization methods**: The paper proposes several ways to accelerate training, chosen according to the computing power of the server, to address the low training efficiency of split learning.
3. **Experimentally verified the effectiveness of the framework**: Experimental results on SuperGLUE and summarization tasks show that FL-GLM achieves performance comparable to that of the centralized ChatGLM model.

### Summary

By proposing the FL-GLM framework, the paper addresses the problem of training large language models on private data while protecting user privacy, and it improves training efficiency through the client-batching and server-hierarchical optimizations. The experimental results support the effectiveness and practicality of the framework.
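The text above names client-batching without defining it. One plausible reading, assumed here purely for illustration, is that the server stacks the hidden states received from several clients into one batch and processes them in a single pass instead of serving one client at a time. The sketch below uses hypothetical function names and a trivial stand-in for the server-side blocks; it only shows that batching can return the same per-client results as sequential serving.

```python
def server_forward(batch):
    """Hypothetical stand-in for the server-side blocks: doubles every value."""
    return [[2.0 * v for v in row] for row in batch]


def serve_sequentially(requests):
    """Baseline: one server pass per client request."""
    return {client_id: server_forward(rows) for client_id, rows in requests.items()}


def serve_batched(requests):
    """Assumed client-batching: stack all clients' rows, run one server pass,
    then split the output back into per-client results."""
    order, sizes, stacked = [], [], []
    for client_id, rows in requests.items():
        order.append(client_id)
        sizes.append(len(rows))
        stacked.extend(rows)

    output = server_forward(stacked)  # a single pass over every client's rows

    results, start = {}, 0
    for client_id, size in zip(order, sizes):
        results[client_id] = output[start:start + size]
        start += size
    return results
```

For example, calling `serve_batched({"a": [[1.0, 2.0]], "b": [[3.0, 4.0], [5.0, 6.0]]})` issues one `server_forward` call where `serve_sequentially` would issue two. The server-hierarchical method is not sketched because the text gives no detail on how it works.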

Safely Learning with Private Data: A Federated Learning Framework for Large Language Model

FedDGP: Disentangling Global and Personal Models for Federated Learning

OpenFedLLM: Training Large Language Models on Decentralized Private Data via Federated Learning

Federated Large Language Models: Current Progress and Future Directions

Secure and Efficient Decentralized Federated Learning with Data Representation Protection

MLLM-LLaVA-FL: Multimodal Large Language Model Assisted Federated Learning

VFLR: An Efficient and Privacy-Preserving Vertical Federated Framework for Logistic Regression

Federated Large Language Model: Solutions, Challenges and Future Directions

CG-FedLLM: How to Compress Gradients in Federated Fune-tuning for Large Language Models

Federated Domain-Specific Knowledge Transfer on Large Language Models Using Synthetic Data

A flexible and privacy-preserving federated learning framework based on logistic regression

Efficient Vertical Federated Learning with Secure Aggregation

Towards Federated Large Language Models: Motivations, Methods, and Future Directions

Fluent: Round-efficient Secure Aggregation for Private Federated Learning

FedV: Privacy-Preserving Federated Learning over Vertically Partitioned Data

LF3PFL: A Practical Privacy-Preserving Federated Learning Algorithm Based on Local Federalization Scheme

CELLM: An Efficient Communication in Large Language Models Training for Federated Learning

Secure Vertical Federated Learning Under Unreliable Connectivity

FedJudge: Federated Legal Large Language Model

A Lightweight and Accuracy-Lossless Privacy-Preserving Method in Federated Learning

A New Implementation of Federated Learning for Privacy and Security Enhancement