Abstract: Fine-tuning large-scale pretrained models is prohibitively expensive in terms of computational and memory costs. LoRA, one of the most popular Parameter-Efficient Fine-Tuning (PEFT) methods, offers a cost-effective alternative by fine-tuning an auxiliary low-rank model that has significantly fewer parameters. Although LoRA significantly reduces the computational and memory requirements at each iteration, extensive empirical evidence indicates that it converges considerably more slowly than full fine-tuning, ultimately leading to increased overall compute and often worse test performance. In this paper, we perform an in-depth investigation of the initialization method of LoRA and show that careful initialization (without any change to the architecture or the training algorithm) can significantly enhance both efficiency and performance. In particular, we introduce a novel initialization method, LoRA-GA (Low-Rank Adaptation with Gradient Approximation), which aligns the gradients of the low-rank matrix product with those of full fine-tuning at the first step. Our extensive experiments demonstrate that LoRA-GA achieves a convergence rate comparable to that of full fine-tuning (and is hence significantly faster than vanilla LoRA as well as various recent improvements) while simultaneously attaining comparable or even better performance. For example, on a subset of the GLUE dataset with T5-Base, LoRA-GA outperforms LoRA by 5.69% on average. On larger models such as Llama 2-7B, LoRA-GA shows performance improvements of 0.34, 11.52%, and 5.05% on MT-bench, GSM8K, and Human-eval, respectively. Additionally, we observe up to 2-4 times faster convergence than vanilla LoRA, validating its effectiveness in accelerating convergence and enhancing model performance. Code is available at https://github.com/Outsider565/LoRA-GA.
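To make the idea of aligning the first-step gradients concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the paper's exact procedure. It assumes the adapted weight is W + B A, uses a random matrix G as a stand-in for the full fine-tuning gradient on the first batch, and initializes A and B from the top-r singular vectors of G. The helper names (`first_step_update`, `svd_init`, `cosine`) are hypothetical, and the actual index selection and scaling used by LoRA-GA may differ. The sketch compares how well the first-order change of B A after one SGD step aligns with G under vanilla initialization (A random, B zero) and under the SVD-based initialization.

```python
import numpy as np

def first_step_update(G, A, B):
    """First-order direction in which B @ A moves after one SGD step (up to a factor -lr)."""
    grad_A = B.T @ G   # dL/dA when the effective weight is W + B @ A
    grad_B = G @ A.T   # dL/dB
    return B @ grad_A + grad_B @ A

def cosine(X, Y):
    """Cosine similarity between two matrices under the Frobenius inner product."""
    return float(np.sum(X * Y) / (np.linalg.norm(X) * np.linalg.norm(Y)))

def svd_init(G, r):
    """Assumed scheme: take the top-r left/right singular vectors of G as B and A."""
    U, _, Vt = np.linalg.svd(G, full_matrices=False)
    return Vt[:r, :], U[:, :r]   # A has shape (r, n), B has shape (m, r)

rng = np.random.default_rng(0)
m, n, r = 64, 48, 4
W = rng.normal(size=(m, n))   # pretrained weight (random stand-in)
G = rng.normal(size=(m, n))   # full fine-tuning gradient on the first batch (random stand-in)

# Vanilla LoRA initialization: A random, B zero, so B @ A = 0 at the start.
A0 = rng.normal(size=(r, n)) / np.sqrt(n)
B0 = np.zeros((m, r))

# Gradient-based initialization; subtract B @ A from the frozen weight so the
# model's output at step 0 is unchanged.
A1, B1 = svd_init(G, r)
W_frozen = W - B1 @ A1

cos_vanilla = cosine(first_step_update(G, A0, B0), G)
cos_svd = cosine(first_step_update(G, A1, B1), G)
print(f"alignment with full gradient: vanilla={cos_vanilla:.3f}, svd-init={cos_svd:.3f}")
```

In this sketch the SVD-based initialization makes the first-step update a multiple of the best rank-r approximation of G, so its alignment with G is the highest any rank-r update can achieve, whereas the vanilla update G AᵀA is a random rank-r projection of G.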

MELoRA: Mini-Ensemble Low-Rank Adapters for Parameter-Efficient Fine-Tuning

MiLoRA: Efficient Mixture of Low-Rank Adaptation for Large Language Models Fine-tuning

MTL-LoRA: Low-Rank Adaptation for Multi-Task Learning

LoRTA: Low Rank Tensor Adaptation of Large Language Models

IncreLoRA: Incremental Parameter Allocation Method for Parameter-Efficient Fine-tuning

MALoRA: Mixture of Asymmetric Low-Rank Adaptation for Enhanced Multi-Task Learning

ALoRA: Allocating Low-Rank Adaptation for Fine-tuning Large Language Models

LoRETTA: Low-Rank Economic Tensor-Train Adaptation for Ultra-Low-Parameter Fine-Tuning of Large Language Models

MLAE: Masked LoRA Experts for Parameter-Efficient Fine-Tuning

PRILoRA: Pruned and Rank-Increasing Low-Rank Adaptation

ShareLoRA: Parameter Efficient and Robust Large Language Model Fine-tuning via Shared Low-Rank Adaptation

Matrix-Transformation Based Low-Rank Adaptation (MTLoRA): A Brain-Inspired Method for Parameter-Efficient Fine-Tuning

DoRA: Enhancing Parameter-Efficient Fine-Tuning with Dynamic Rank Distribution

Lottery Rank-Pruning Adaptation Parameter Efficient Fine-Tuning

MLAE: Masked LoRA Experts for Visual Parameter-Efficient Fine-Tuning

LoRA-GA: Low-Rank Adaptation with Gradient Approximation

PeriodicLoRA: Breaking the Low-Rank Bottleneck in LoRA Optimization

LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report

MoELoRA: Contrastive Learning Guided Mixture of Experts on Parameter-Efficient Fine-Tuning for Large Language Models

LoRA$^2$: Multi-Scale Low-Rank Approximations for Fine-Tuning Large Language Models

FanLoRA: Fantastic LoRAs and Where to Find Them in Large Language Model Fine-tuning