Abstract: The ability of large language models (LLMs) to scale to highly complex models and large datasets has led to tremendous success in pivotal domains. While there is an urgent need to acquire more training data for LLMs, a concerning reality is that high-quality public datasets are expected to be depleted within a few years. In view of this, the federated learning (FL) LLM fine-tuning paradigm has recently been proposed to facilitate collaborative LLM fine-tuning on distributed private data, where multiple data owners collaboratively fine-tune a shared LLM without sharing raw data. However, the staggering size of LLMs imposes heavy computing and communication burdens on clients, posing significant barriers to the democratization of the FL LLM fine-tuning paradigm. To address this issue, split learning (SL) has emerged as a promising solution: it offloads the primary training workload to a server via model partitioning, so that clients and server exchange only activations and activation gradients, which are much smaller than the entire LLM. Unfortunately, research on the SL LLM fine-tuning paradigm is still in its nascent stage. To fill this gap, in this paper, we propose the first SL LLM fine-tuning framework, named SplitLoRA. SplitLoRA is built on the split federated learning (SFL) framework, which combines the parallel training of FL with the model splitting of SL and thus greatly enhances training efficiency. It is worth noting that SplitLoRA is the first open-source benchmark for SL LLM fine-tuning, providing a foundation for research efforts dedicated to advancing SL LLM fine-tuning. Extensive simulations validate that SplitLoRA achieves target accuracy in significantly less time than state-of-the-art LLM fine-tuning frameworks, demonstrating the superior training performance of SplitLoRA. The project page is available at https://fduinc.github.io/splitlora/.
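As a rough illustration of the split federated learning workflow described in the abstract, the sketch below simulates the message flow on a toy problem. It is not the SplitLoRA implementation: two scalar weights stand in for the client-side and server-side portions of a model, no LoRA adapters are involved, and the names `Client`, `Server`, `train_round`, and `run` are hypothetical. It also assumes a single shared server-side weight and a simple average of client-side weights after each round, which is one possible SFL variant among several.

```python
from dataclasses import dataclass


@dataclass
class Client:
    w: float     # client-side weight (the part of the model before the cut layer)
    data: list   # private (x, y) pairs that never leave the client

    def forward(self, x):
        # Only this activation is sent to the server, not the raw sample x.
        return self.w * x

    def backward(self, x, grad_activation, lr):
        # Chain rule: d(loss)/d(w) = d(loss)/d(activation) * x
        self.w -= lr * grad_activation * x


class Server:
    def __init__(self, w):
        self.w = w  # server-side weight (the part of the model after the cut layer)

    def step(self, activation, y, lr):
        pred = self.w * activation
        err = pred - y                   # gradient of 0.5 * (pred - y)**2 w.r.t. pred
        grad_activation = err * self.w   # computed with the pre-update weight
        self.w -= lr * err * activation  # server-side update
        return 0.5 * err * err, grad_activation


def train_round(clients, server, lr=0.05):
    total, count = 0.0, 0
    # Clients are visited one after another here; in SFL they would run in parallel.
    for client in clients:
        for x, y in client.data:
            activation = client.forward(x)                # client -> server
            loss, grad = server.step(activation, y, lr)   # server -> client
            client.backward(x, grad, lr)
            total += loss
            count += 1
    # Assumed aggregation step: average the client-side weights (FedAvg style).
    avg = sum(c.w for c in clients) / len(clients)
    for client in clients:
        client.w = avg
    return total / count


def run(rounds=200):
    # Toy target y = 2x, split across two clients with disjoint private samples.
    clients = [
        Client(0.5, [(1.0, 2.0), (2.0, 4.0)]),
        Client(0.5, [(-1.0, -2.0), (0.5, 1.0)]),
    ]
    server = Server(0.5)
    losses = [train_round(clients, server) for _ in range(rounds)]
    return losses, clients, server
```

Calling `run()` returns the per-round average loss together with the final clients and server. The point of the sketch is the traffic pattern: each step moves one activation and one activation gradient across the cut, while the raw samples and the full set of weights stay where they are.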

Self-Distillation Bridges Distribution Gap in Language Model Fine-Tuning

Self-Data Distillation for Recovering Quality in Pruned Large Language Models

Enhancing Multilingual Capabilities of Large Language Models through Self-Distillation from Resource-Rich Languages

Balancing Speciality and Versatility: a Coarse to Fine Framework for Supervised Fine-tuning Large Language Model

Natural Language Fine-Tuning

Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes

Get more for less: Principled Data Selection for Warming Up Fine-Tuning in LLMs

A Framework for Fine-Tuning LLMs using Heterogeneous Feedback

SplitLoRA: A Split Parameter-Efficient Fine-Tuning Framework for Large Language Models

Mitigating Training Imbalance in LLM Fine-Tuning via Selective Parameter Merging

LaFFi: Leveraging Hybrid Natural Language Feedback for Fine-tuning Language Models

From Quantity to Quality: Boosting LLM Performance with Self-Guided Data Selection for Instruction Tuning

Boosting LLM Translation Skills without General Ability Loss via Rationale Distillation

Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models

Fine-tuning large language models for domain adaptation: Exploration of training strategies, scaling, model merging and synergistic capabilities

Minor SFT loss for LLM fine-tune to increase performance and reduce model deviation

From Artificial Needles to Real Haystacks: Improving Retrieval Capabilities in LLMs by Finetuning on Synthetic Data

Dynamic Corrective Self-Distillation for Better Fine-Tuning of Pretrained Models

Dancing in Chains: Reconciling Instruction Following and Faithfulness in Language Models

Adapting Large Language Models for Content Moderation: Pitfalls in Data Engineering and Supervised Fine-tuning

An Emulator for Fine-Tuning Large Language Models using Small Language Models