MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning

Ting Jiang, Shaohan Huang, Shengyue Luo, Zihan Zhang, Haizhen Huang, Furu Wei, Weiwei Deng, Feng Sun, Qi Zhang, Deqing Wang, Fuzhen Zhuang

2024-05-20

Abstract: Low-rank adaptation is a popular parameter-efficient fine-tuning method for large language models. In this paper, we analyze the impact of low-rank updating, as implemented in LoRA. Our findings suggest that the low-rank updating mechanism may limit the ability of LLMs to effectively learn and memorize new knowledge. Inspired by this observation, we propose a new method called MoRA, which employs a square matrix to achieve high-rank updating while maintaining the same number of trainable parameters. To achieve this, we introduce corresponding non-parameter operators that reduce the input dimension and increase the output dimension for the square matrix. Furthermore, these operators ensure that the weight can be merged back into LLMs, so our method can be deployed like LoRA. We perform a comprehensive evaluation of our method across five tasks: instruction tuning, mathematical reasoning, continual pretraining, memory, and pretraining. Our method outperforms LoRA on memory-intensive tasks and achieves comparable performance on other tasks.
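The "same number of trainable parameters" claim can be made concrete with a little arithmetic. The sketch below is illustrative only: it assumes the usual LoRA parameterization (two factors of shape d x r and r x k) and picks the largest square matrix that fits the same budget. The function names and the layer shape are hypothetical and are not taken from the abstract.

```python
import math

def lora_param_count(d, k, r):
    """Trainable parameters of a rank-r LoRA update on a d x k weight (factors d x r and r x k)."""
    return r * (d + k)

def square_size_for_budget(d, k, r):
    """Largest side r_hat of a square matrix with r_hat**2 <= the LoRA parameter budget."""
    return math.isqrt(lora_param_count(d, k, r))

# Hypothetical layer shape and LoRA rank, chosen only for illustration.
d, k, r = 4096, 4096, 8
budget = lora_param_count(d, k, r)       # 8 * (4096 + 4096) = 65536
r_hat = square_size_for_budget(d, k, r)  # isqrt(65536) = 256
print(budget, r_hat, r_hat * r_hat)      # prints: 65536 256 65536
```

Under these assumptions, the LoRA update has rank at most 8, while a 256 x 256 square matrix with the same parameter count can reach rank 256, which is the sense in which the update is "high-rank".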

Computation and Language, Machine Learning

What problem does this paper attempt to address?

The paper examines how low-rank updating, as used in Low-Rank Adaptation (LoRA), affects parameter-efficient fine-tuning (PEFT) of large language models (LLMs). LoRA reduces the memory required for training by updating weights through low-rank matrices, but the authors find that this restriction may limit an LLM's ability to learn and memorize new knowledge.

To address this, they propose MoRA, which replaces the low-rank matrices with a single square matrix to achieve high-rank updates while keeping the same number of trainable parameters as LoRA. Non-parametric operators reduce the input dimension before the square matrix and increase the output dimension after it, and they are chosen so that the learned update can be merged back into the LLM's weights, just as with LoRA.

MoRA is evaluated on five tasks: instruction tuning, mathematical reasoning, continual pretraining, memory, and pretraining. It outperforms LoRA on memory-intensive tasks and performs comparably on the others. The paper also distinguishes between kinds of fine-tuning: instruction tuning mainly teaches the model a format for interaction rather than new knowledge, whereas mathematical reasoning and continual pretraining require the model to gain knowledge and capabilities. The experiments indicate that low-rank updates are a limitation on memory-intensive tasks and that high-rank updates mitigate it. The paper additionally reviews related work on LoRA and other PEFT methods, including approaches that improve LoRA by increasing the rank of its update. Overall, the comparison between LoRA and MoRA across tasks supports the effectiveness of high-rank updating.
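A minimal sketch may help show why non-parametric operators keep the update mergeable. It assumes the simplest possible pair, truncating the input to its first r_hat entries and zero-padding the output, which is an illustrative choice and not necessarily the operators used in the paper. All names are hypothetical. Because both operators are fixed linear maps, their composition with the square matrix equals a single d x k matrix that could be added to the frozen weight.

```python
def matvec(m, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, x)) for row in m]

def compress(x, r_hat):
    """Non-parametric input reduction: keep the first r_hat entries (illustrative choice)."""
    return x[:r_hat]

def decompress(y, d):
    """Non-parametric output expansion: zero-pad to length d (illustrative choice)."""
    return y + [0.0] * (d - len(y))

def adapter_forward(m, x, d):
    """Apply decompress(M @ compress(x)) for a square trainable matrix M."""
    return decompress(matvec(m, compress(x, len(m))), d)

def merged_delta(m, d, k):
    """Build the d x k matrix equivalent to the adapter, so it can be added to the base weight."""
    delta = [[0.0] * k for _ in range(d)]
    for i, row in enumerate(m):
        for j, value in enumerate(row):
            delta[i][j] = value
    return delta

# Toy sizes: a 2 x 2 square matrix adapting a 3 x 4 weight.
M = [[1.0, 2.0], [0.0, 3.0]]
x = [1.0, -1.0, 4.0, 0.5]
d, k = 3, 4
via_operators = adapter_forward(M, x, d)
via_merged = matvec(merged_delta(M, d, k), x)
print(via_operators == via_merged)  # both paths give the same output vector
```

With truncation and zero-padding the merged matrix only touches one block of the weight; richer fixed operators would spread the update more widely while keeping the same merge property, since any fixed linear compress and decompress pair collapses into one matrix.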

MoR: Mixture of Ranks for Low-Rank Adaptation Tuning

Matrix-Transformation Based Low-Rank Adaptation (MTLoRA): A Brain-Inspired Method for Parameter-Efficient Fine-Tuning

ALoRA: Allocating Low-Rank Adaptation for Fine-tuning Large Language Models

PRILoRA: Pruned and Rank-Increasing Low-Rank Adaptation

LoRA-FA: Memory-efficient Low-rank Adaptation for Large Language Models Fine-tuning

AutoLoRA: Automatically Tuning Matrix Ranks in Low-Rank Adaptation Based on Meta Learning

Sparse Low-rank Adaptation of Pre-trained Language Models

LoRA-Pro: Are Low-Rank Adapters Properly Optimized?

Low-Rank Adaptation with Task-Relevant Feature Enhancement for Fine-tuning Language Models

MiLoRA: Efficient Mixture of Low-Rank Adaptation for Large Language Models Fine-tuning

Structure-Aware Low-Rank Adaptation for Parameter-Efficient Fine-Tuning

LoRA-Mini : Adaptation Matrices Decomposition and Selective Training

NoRA: Nested Low-Rank Adaptation for Efficient Fine-Tuning Large Models

IncreLoRA: Incremental Parameter Allocation Method for Parameter-Efficient Fine-tuning

LoRTA: Low Rank Tensor Adaptation of Large Language Models

MiLoRA: Harnessing Minor Singular Components for Parameter-Efficient LLM Finetuning

PeriodicLoRA: Breaking the Low-Rank Bottleneck in LoRA Optimization

CoRA: Optimizing Low-Rank Adaptation with Common Subspace of Large Language Models

LoRA Learns Less and Forgets Less

MELoRA: Mini-Ensemble Low-Rank Adapters for Parameter-Efficient Fine-Tuning