Abstract: Low-rank adaptation, also known as LoRA, has emerged as a prominent method for parameter-efficient fine-tuning of foundation models. Despite its computational efficiency, LoRA still yields inferior performance compared to full fine-tuning. In this paper, we first uncover a fundamental connection between the optimization processes of LoRA and full fine-tuning: using LoRA for optimization is mathematically equivalent to full fine-tuning using a low-rank gradient for parameter updates. Moreover, this low-rank gradient can be expressed in terms of the gradients of the two low-rank matrices in LoRA. Leveraging this insight, we introduce LoRA-Pro, a method that enhances LoRA's performance by strategically adjusting the gradients of these low-rank matrices. This adjustment allows the low-rank gradient to more accurately approximate the full fine-tuning gradient, thereby narrowing the performance gap between LoRA and full fine-tuning. Furthermore, we theoretically derive the optimal solutions for adjusting the gradients of the low-rank matrices and apply them during fine-tuning in LoRA-Pro. We conduct extensive experiments across natural language understanding, dialogue generation, mathematical reasoning, code generation, and image classification tasks, demonstrating that LoRA-Pro substantially improves LoRA's performance, effectively narrowing the gap with full fine-tuning. Code is publicly available at https://github.com/mrflogs/LoRA-Pro.
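
The equivalence claimed in the abstract can be illustrated numerically. The sketch below is not taken from the LoRA-Pro code. It assumes the standard LoRA parameterization W = W0 + s·B·A (which the abstract does not spell out), plain gradient descent, and a random matrix `g` standing in for the full fine-tuning gradient. The helper name `lora_equivalent_gradient` and all dimensions are hypothetical. Under these assumptions, one LoRA step changes W by the negative learning rate times a low-rank matrix built from the gradients of A and B, plus a second-order term.

```python
import numpy as np

def lora_equivalent_gradient(A, B, g_A, g_B, s):
    """Low-rank gradient on W implied by the gradients of A and B (assumed form)."""
    return s * (B @ g_A + g_B @ A)

rng = np.random.default_rng(0)
m, n, r, s = 6, 5, 2, 2.0          # hypothetical sizes, rank, and scaling factor
B = rng.normal(size=(m, r))        # LoRA matrix B
A = rng.normal(size=(r, n))        # LoRA matrix A
g = rng.normal(size=(m, n))        # stand-in for the full fine-tuning gradient dL/dW

# Chain rule through W = W0 + s * B @ A
g_A = s * B.T @ g
g_B = s * g @ A.T
g_tilde = lora_equivalent_gradient(A, B, g_A, g_B, s)

# One gradient-descent step on A and B, and the resulting change in W
lr = 0.1
A_new = A - lr * g_A
B_new = B - lr * g_B
delta_W = s * (B_new @ A_new) - s * (B @ A)

# delta_W equals -lr * g_tilde up to a second-order term in lr
residual = delta_W + lr * g_tilde
rank = np.linalg.matrix_rank(g_tilde)
```

Here `residual` is exactly the second-order term `s * lr**2 * g_B @ g_A`, and the rank of `g_tilde` is at most 2r, which is the sense in which the implied update on W is low-rank. How LoRA-Pro then adjusts `g_A` and `g_B` so that `g_tilde` better approximates `g` is derived in the paper and is not reproduced in this sketch.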

Chain of LoRA: Efficient Fine-tuning of Language Models via Residual Learning

Low-Rank Adaptation with Task-Relevant Feature Enhancement for Fine-tuning Language Models

LoRA$^2$ : Multi-Scale Low-Rank Approximations for Fine-Tuning Large Language Models

LoRA Learns Less and Forgets Less

Delta-LoRA: Fine-Tuning High-Rank Parameters with the Delta of Low-Rank Matrices

Chain-of-LoRA: Enhancing the Instruction Fine-Tuning Performance of Low-Rank Adaptation on Diverse Instruction Set

ResLoRA: Identity Residual Mapping in Low-Rank Adaption

PeriodicLoRA: Breaking the Low-Rank Bottleneck in LoRA Optimization

Sparse Low-rank Adaptation of Pre-trained Language Models

LoRA-Mini : Adaptation Matrices Decomposition and Selective Training

LoRA-SP: Streamlined Partial Parameter Adaptation for Resource-Efficient Fine-Tuning of Large Language Models

LoRTA: Low Rank Tensor Adaptation of Large Language Models

ALoRA: Allocating Low-Rank Adaptation for Fine-tuning Large Language Models

LoRA-FA: Memory-efficient Low-rank Adaptation for Large Language Models Fine-tuning

AutoLoRA: Automatically Tuning Matrix Ranks in Low-Rank Adaptation Based on Meta Learning

Flat-LoRA: Low-Rank Adaption over a Flat Loss Landscape

The Expressive Power of Low-Rank Adaptation

LoRA-Pro: Are Low-Rank Adapters Properly Optimized?

HyperLoRA: Efficient Cross-task Generalization Via Constrained Low-Rank Adapters Generation

Less is More: Extreme Gradient Boost Rank-1 Adaption for Efficient Finetuning of LLMs

FanLoRA: Fantastic LoRAs and Where to Find Them in Large Language Model Fine-tuning