Abstract: Low-rank adaptation of large models, particularly LoRA, has gained traction due to its computational efficiency. This efficiency, contrasted with the prohibitive costs of full-model fine-tuning, means that practitioners often turn to LoRA and sometimes without a complete understanding of its ramifications. In this study, we focus on fairness and ask whether LoRA has an unexamined impact on utility, calibration, and resistance to membership inference across different subgroups (e.g., genders, races, religions) compared to a full-model fine-tuning baseline. We present extensive experiments across vision and language domains and across classification and generation tasks using ViT-Base, Swin-v2-Large, Llama-2 7B, and Mistral 7B. Intriguingly, experiments suggest that while one can isolate cases where LoRA exacerbates model bias across subgroups, the pattern is inconsistent -- in many cases, LoRA has equivalent or even improved fairness compared to the base model or its full fine-tuning baseline. We also examine the complications of evaluating fine-tuning fairness relating to task design and model token bias, calling for more careful fairness evaluations in future work.

What problem does this paper attempt to address?

The paper examines how Low-Rank Adaptation (LoRA) of large models affects fairness. Specifically, the study covers the following aspects:

1. **Research Background and Motivation**:
   - As large models become widely used, parameter-efficient fine-tuning techniques such as LoRA are favored for their computational efficiency.
   - However, the impact of LoRA on model trustworthiness, particularly fairness and robustness, is still poorly understood.
2. **Core Issues**:
   - The core question is whether LoRA affects fairness across subgroups (e.g., gender, race, religion).
   - To answer it, the study compares LoRA against full-model fine-tuning across multiple tasks and datasets.
3. **Experimental Design and Results**:
   - The experiments cover multiple tasks in the vision and language domains, including but not limited to hate speech detection, gender classification, and machine translation.
   - Several pre-trained models are compared: ViT-Base, Swin-v2-Large, Llama-2 7B, and Mistral 7B.
   - The results indicate that LoRA may exacerbate model bias in some cases; in many cases, however, LoRA shows comparable or even better fairness than full-model fine-tuning.
4. **Fairness Evaluation Metrics**:
   - The study uses multiple fairness metrics: subgroup accuracy disparity, worst subgroup accuracy, demographic parity difference (DPD), and equal opportunity difference (EOD).
   - It also evaluates calibration across subgroups and resistance to membership inference attacks (MIA).
5. **Conclusion**:
   - Overall, LoRA does not significantly worsen fairness in many tasks and, in some cases, demonstrates better fairness.
   - Fairness results may depend on the quality of the underlying pre-trained model, and the LoRA rank has little impact on fairness.
   - For generative tasks, LLMs may have unpredictable token biases, which can affect the accuracy of fairness evaluations.

In summary, this paper evaluates the fairness of LoRA in large-model fine-tuning and calls for more careful fairness evaluations in future work.
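The fairness metrics named in the summary can be illustrated with a minimal sketch. It uses common textbook definitions, which are an assumption here: DPD is taken as the largest gap in positive-prediction rate between subgroups, and EOD as the largest gap in true-positive rate. The paper's exact formulations may differ, and the function name `fairness_report` and the toy data are hypothetical.

```python
import numpy as np


def fairness_report(y_true, y_pred, groups, positive=1):
    """Per-subgroup fairness summary for a binary classifier.

    Assumed definitions (not taken from the paper):
      accuracy_disparity      = max subgroup accuracy - min subgroup accuracy
      worst_subgroup_accuracy = min subgroup accuracy
      dpd = max gap in P(y_pred = positive | group)
      eod = max gap in P(y_pred = positive | y_true = positive, group)
    """
    y_true, y_pred, groups = map(np.asarray, (y_true, y_pred, groups))
    acc, pos_rate, tpr = {}, {}, {}
    for g in np.unique(groups).tolist():
        mask = groups == g
        acc[g] = float(np.mean(y_pred[mask] == y_true[mask]))
        pos_rate[g] = float(np.mean(y_pred[mask] == positive))
        # True-positive rate is undefined if the subgroup has no positives.
        pos_mask = mask & (y_true == positive)
        tpr[g] = (float(np.mean(y_pred[pos_mask] == positive))
                  if pos_mask.any() else float("nan"))

    def gap(values):
        # Largest difference between any two subgroups, ignoring NaNs.
        vals = list(values)
        return float(np.nanmax(vals) - np.nanmin(vals))

    return {
        "subgroup_accuracy": acc,
        "accuracy_disparity": gap(acc.values()),
        "worst_subgroup_accuracy": min(acc.values()),
        "dpd": gap(pos_rate.values()),
        "eod": gap(tpr.values()),
    }


if __name__ == "__main__":
    # Toy data: two subgroups "a" and "b" with four examples each.
    y_true = [1, 1, 0, 0, 1, 1, 0, 0]
    y_pred = [1, 1, 0, 0, 1, 0, 1, 0]
    groups = list("aaaabbbb")
    print(fairness_report(y_true, y_pred, groups))
```

On this toy data both subgroups receive positive predictions at the same rate, so DPD is 0, while the true-positive rates differ (1.0 versus 0.5), so EOD is 0.5. This is one reason to report several metrics side by side rather than relying on a single one.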

On Fairness of Low-Rank Adaptation of Large Models
