Abstract: We present a novel Parameter-Efficient Fine-Tuning (PEFT) method, dubbed Adaptive Freezing of Low Rank Adaptation (AFLoRA). Specifically, for each pre-trained frozen weight tensor, we add a parallel path of trainable low-rank matrices, namely a down-projection and an up-projection matrix, each of which is followed by a feature transformation vector. Based on a novel freezing score, we then incrementally freeze these projection matrices during fine-tuning to reduce the computation and alleviate over-fitting. Our experimental results demonstrate that we can achieve state-of-the-art performance with an average improvement of up to $0.85\%$ as evaluated on the GLUE benchmark while yielding up to $9.5\times$ fewer average trainable parameters. In terms of runtime, AFLoRA can yield up to $1.86\times$ improvement compared with similar PEFT alternatives. Besides the practical utility of our approach, we provide insights on the trainability requirements of LoRA paths at different modules and the freezing schedule for the different projection matrices. Code will be released.

What problem does this paper attempt to address?

This paper addresses how to maintain or improve model performance in parameter-efficient fine-tuning (PEFT) of large pre-trained models while reducing the number of trainable parameters and the computational cost. Specifically, it proposes a new method, Adaptive Freezing of Low Rank Adaptation (AFLoRA), which pursues this goal by dynamically freezing the projection matrices in the low-rank paths.

### Main Issues

1. **Parameter Efficiency**: Existing PEFT methods such as LoRA and ELoRA reduce the number of trainable parameters but still incur computational overhead and may require a high rank to maintain performance.
2. **Overfitting Problem**: Reducing the number of trainable parameters helps alleviate overfitting, but doing so while maintaining performance is a challenge.
3. **Computational Efficiency**: Beyond reducing the parameter count, the method should also reduce runtime and computational load.

### Solution

The paper proposes the AFLoRA method, with the following main contributions:

1. **Low-Rank Paths**: A parallel low-rank path is added to each frozen weight tensor in the pre-trained model. The path consists of a down-projection matrix and an up-projection matrix, each followed by a feature transformation vector.
2. **Adaptive Freezing**: The projection matrices are frozen incrementally during fine-tuning, based on a novel freezing score, to reduce computational load and alleviate overfitting.
3. **Performance Improvement**: Experimental results show that AFLoRA improves average performance on the GLUE benchmark by up to 0.85%, with up to 9.5 times fewer average trainable parameters, up to 1.86 times better runtime, and a 2.96 times reduction in computational load.

### Experimental Validation

The paper conducts extensive experiments on multiple NLP benchmark datasets, comparing AFLoRA with existing methods such as LoRA and ELoRA. The results show that AFLoRA matches or outperforms these methods in task performance while also improving parameter efficiency and computational efficiency.

### Conclusion

By adaptively freezing the projection matrices in the low-rank paths, AFLoRA maintains or even improves model performance while reducing the number of trainable parameters and the computational cost, providing a new option for parameter-efficient fine-tuning of large-scale pre-trained models.
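The structure described above (a frozen weight, a parallel down/up projection path with a feature transformation vector after each projection, and incremental freezing) can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation. The class name `LowRankPath`, the initialization values, and the `maybe_freeze` rule are all assumptions: the summary does not define the freezing score, so the sketch takes scores as an input and freezes a projection matrix once its score falls below a hypothetical threshold.

```python
import numpy as np


class LowRankPath:
    """Frozen weight plus a parallel low-rank path (illustrative sketch only)."""

    def __init__(self, w0, rank, seed=0):
        rng = np.random.default_rng(seed)
        d_out, d_in = w0.shape
        self.w0 = w0  # pre-trained weight, never updated
        self.down = rng.normal(0.0, 0.02, (rank, d_in))  # down-projection matrix
        self.up = rng.normal(0.0, 0.02, (d_out, rank))   # up-projection matrix
        # Feature transformation vectors, one after each projection.
        # Assumption: the second starts at zero so the path is initially a no-op.
        self.s_down = np.ones(rank)
        self.s_up = np.zeros(d_out)
        # Both projection matrices start trainable; the vectors stay trainable.
        self.trainable = {"down": True, "up": True}

    def forward(self, x):
        h = (x @ self.down.T) * self.s_down   # down-project, then scale
        delta = (h @ self.up.T) * self.s_up   # up-project, then scale
        return x @ self.w0.T + delta          # frozen path + low-rank path

    def maybe_freeze(self, scores, threshold):
        # Hypothetical rule: freeze a projection once its score drops below
        # the threshold. The score itself is supplied by the caller.
        for name, score in scores.items():
            if self.trainable[name] and score < threshold:
                self.trainable[name] = False

    def num_trainable(self):
        n = self.s_down.size + self.s_up.size
        if self.trainable["down"]:
            n += self.down.size
        if self.trainable["up"]:
            n += self.up.size
        return n


path = LowRankPath(np.eye(4), rank=2)
before = path.num_trainable()
path.maybe_freeze({"down": 0.01, "up": 0.9}, threshold=0.1)
after = path.num_trainable()
```

In this toy setting, freezing the down-projection removes its `rank * d_in` entries from the trainable count while the feature transformation vectors remain trainable, which is the mechanism the summary credits for the reduced parameter count.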

AFLoRA: Adaptive Freezing of Low Rank Adaptation in Parameter Efficient Fine-Tuning of Large Models

IncreLoRA: Incremental Parameter Allocation Method for Parameter-Efficient Fine-tuning

LoRA-GA: Low-Rank Adaptation with Gradient Approximation

LoRTA: Low Rank Tensor Adaptation of Large Language Models

DoRA: Enhancing Parameter-Efficient Fine-Tuning with Dynamic Rank Distribution

Flat-LoRA: Low-Rank Adaption over a Flat Loss Landscape

PRILoRA: Pruned and Rank-Increasing Low-Rank Adaptation

ALoRA: Allocating Low-Rank Adaptation for Fine-tuning Large Language Models

LoRA-FA: Memory-efficient Low-rank Adaptation for Large Language Models Fine-tuning

LoRETTA: Low-Rank Economic Tensor-Train Adaptation for Ultra-Low-Parameter Fine-Tuning of Large Language Models

MELoRA: Mini-Ensemble Low-Rank Adapters for Parameter-Efficient Fine-Tuning

NEAT: Nonlinear Parameter-efficient Adaptation of Pre-trained Models

Riemannian Preconditioned LoRA for Fine-Tuning Foundation Models

GeoLoRA: Geometric integration for parameter efficient fine-tuning

RoSA: Accurate Parameter-Efficient Fine-Tuning via Robust Adaptation

LoLDU: Low-Rank Adaptation via Lower-Diag-Upper Decomposition for Parameter-Efficient Fine-Tuning

Parameter-Efficient Fine-Tuning with Discrete Fourier Transform

DoRA: Weight-Decomposed Low-Rank Adaptation

Improving LoRA in Privacy-preserving Federated Learning

InfLoRA: Interference-Free Low-Rank Adaptation for Continual Learning

PeriodicLoRA: Breaking the Low-Rank Bottleneck in LoRA Optimization