Abstract: The growing demand for larger-scale models in the development of \textbf{L}arge \textbf{L}anguage \textbf{M}odels (LLMs) makes efficient training under limited computational resources challenging. Traditional fine-tuning methods are often unstable in multi-task learning and rely heavily on extensive training resources. Here, we propose MoDULA (\textbf{M}ixture \textbf{o}f \textbf{D}omain-Specific and \textbf{U}niversal \textbf{L}oR\textbf{A}), a novel \textbf{P}arameter \textbf{E}fficient \textbf{F}ine-\textbf{T}uning (PEFT) \textbf{M}ixture-\textbf{o}f-\textbf{E}xpert (MoE) paradigm for improved fine-tuning and parameter efficiency in multi-task learning. The paradigm improves the model's multi-task capability by training universal experts, domain-specific experts, and routers separately. MoDULA-Res is a new method within the MoDULA paradigm that maintains the model's general capability by connecting universal and task-specific experts through residual connections. Experimental results show that the overall performance of the MoDULA-Flan and MoDULA-Res methods surpasses that of existing fine-tuning methods on various LLMs. Notably, MoDULA-Res achieves larger performance improvements across multiple tasks while reducing training costs by over 80\% without losing general capability. Moreover, MoDULA is flexibly pluggable: new tasks can be added efficiently without retraining existing experts from scratch. This progressive training paradigm circumvents data balancing issues, improving training efficiency and model stability. Overall, MoDULA provides a scalable, cost-effective solution for fine-tuning LLMs with enhanced parameter efficiency and generalization capability.
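To make the structure described in the abstract concrete, the following is a minimal sketch of one linear layer that combines a frozen base weight, a universal LoRA expert, and several domain-specific LoRA experts mixed by a router. The abstract does not give the exact formulation, so everything here is an assumption for illustration: the class names `LoRAExpert` and `MixedLoRALayer`, the softmax router, the additive way the universal expert is joined to the routed experts (one plausible reading of "residual connections"), and the stage order in `TRAINABLE_BY_STAGE`. It uses only the Python standard library and shows a forward pass, not training.

```python
import math
import random


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def add(*vecs):
    """Element-wise sum of equally sized vectors."""
    return [sum(vals) for vals in zip(*vecs)]


def softmax(logits):
    """Numerically stable softmax."""
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rand_matrix(rows, cols, rng, std=0.1):
    return [[rng.gauss(0.0, std) for _ in range(cols)] for _ in range(rows)]


class LoRAExpert:
    """Low-rank update delta(x) = B (A x); B starts at zero, as in standard LoRA."""

    def __init__(self, d_in, d_out, rank, rng):
        self.A = rand_matrix(rank, d_in, rng)
        self.B = [[0.0] * rank for _ in range(d_out)]

    def __call__(self, x):
        return matvec(self.B, matvec(self.A, x))


class MixedLoRALayer:
    """Hypothetical layer: y = W x + universal(x) + sum_i g_i(x) * domain_i(x)."""

    def __init__(self, d_in, d_out, rank, n_domains, seed=0):
        rng = random.Random(seed)
        self.d_out = d_out
        self.W = rand_matrix(d_out, d_in, rng)  # frozen base weight
        self.universal = LoRAExpert(d_in, d_out, rank, rng)
        self.domain = [LoRAExpert(d_in, d_out, rank, rng) for _ in range(n_domains)]
        self.router = rand_matrix(n_domains, d_in, rng)  # assumed linear router

    def gates(self, x):
        """Soft mixing weights over the domain-specific experts."""
        return softmax(matvec(self.router, x))

    def forward(self, x):
        routed = [0.0] * self.d_out
        for g_i, expert in zip(self.gates(x), self.domain):
            routed = add(routed, [g_i * v for v in expert(x)])
        # Assumption: the universal expert is added alongside the routed
        # domain experts, so its contribution is always present.
        return add(matvec(self.W, x), self.universal(x), routed)


# Assumed order for training the three components separately:
# only the listed component receives gradient updates in each stage.
TRAINABLE_BY_STAGE = {1: "universal", 2: "domain", 3: "router"}
```

Under this reading, adding a new task would amount to appending one `LoRAExpert` to `domain` and one row to `router`, leaving the already trained experts untouched, which matches the pluggability the abstract describes.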

Mixture-of-LoRAs: An Efficient Multitask Tuning for Large Language Models

Multimodal Instruction Tuning with Conditional Mixture of LoRA

MixLoRA: Enhancing Large Language Models Fine-Tuning with LoRA-based Mixture of Experts

MiLoRA: Efficient Mixture of Low-Rank Adaptation for Large Language Models Fine-tuning

MALoRA: Mixture of Asymmetric Low-Rank Adaptation for Enhanced Multi-Task Learning

A Framework to Implement 1+N Multi-task Fine-tuning Pattern in LLMs Using the CGC-LORA Algorithm

MoR: Mixture of Ranks for Low-Rank Adaptation Tuning

Mixture of LoRA Experts

MTL-LoRA: Low-Rank Adaptation for Multi-Task Learning

When MOE Meets LLMs: Parameter Efficient Fine-tuning for Multi-task Medical Applications

MeteoRA: Multiple-tasks Embedded LoRA for Large Language Models

LLaVA-MoLE: Sparse Mixture of LoRA Experts for Mitigating Data Conflicts in Instruction Finetuning MLLMs

MoDULA: Mixture of Domain-Specific and Universal LoRA for Multi-Task Learning

MoELoRA: Contrastive Learning Guided Mixture of Experts on Parameter-Efficient Fine-Tuning for Large Language Models

Learning Attentional Mixture of LoRAs for Language Model Continual Learning

MultiLoRA: Democratizing LoRA for Better Multi-Task Learning

mLoRA: Fine-Tuning LoRA Adapters via Highly-Efficient Pipeline Parallelism in Multiple GPUs

Higher Layers Need More LoRA Experts

AlphaLoRA: Assigning LoRA Experts Based on Layer Training Quality

Mixture of Cluster-conditional LoRA Experts for Vision-language Instruction Tuning

HDMoLE: Mixture of LoRA Experts with Hierarchical Routing and Dynamic Thresholds for Fine-Tuning LLM-based ASR Models