Prompt Tuning Strikes Back: Customizing Foundation Models with Low-Rank Prompt Adaptation

Abhinav Jain, Swarat Chaudhuri, Thomas Reps, Chris Jermaine
2024-11-01
Abstract: Parameter-Efficient Fine-Tuning (PEFT) has become the standard for customising Foundation Models (FMs) to user-specific downstream tasks. However, typical PEFT methods require storing multiple task-specific adapters, creating scalability issues as these adapters must be housed and run at the FM server. Traditional prompt tuning offers a potential solution by customising them through task-specific input prefixes, but it under-performs compared to other PEFT methods like LoRA. To address this gap, we propose Low-Rank Prompt Adaptation (LoPA), a prompt-tuning-based approach that performs on par with state-of-the-art PEFT methods and full fine-tuning while being more parameter-efficient and not requiring a server-based adapter. LoPA generates soft prompts by balancing between sharing task-specific information across instances and customization for each instance. It uses a low-rank decomposition of the soft-prompt component encoded for each instance to achieve parameter efficiency. We provide a comprehensive evaluation on multiple natural language understanding and code generation and understanding tasks across a wide range of foundation models with varying sizes.
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
### Problems Addressed by the Paper

This paper aims to address the scalability and performance issues faced by Parameter-Efficient Fine-Tuning (PEFT) methods when customizing Foundation Models (FMs). Specifically:

1. **Scalability Issue**: Traditional PEFT methods require storing multiple task-specific adapters on the server, which increases storage and operational costs, especially when serving a large number of user-specific tasks.
2. **Performance Issue**: Although traditional prompt tuning is highly parameter-efficient, its performance is often inferior to that of other PEFT methods, such as Low-Rank Adaptation (LoRA).

To tackle these issues, the authors propose the Low-Rank Prompt Adaptation (LoPA) method. LoPA generates soft prompts by balancing task-specific and instance-specific information, achieving performance comparable to state-of-the-art PEFT methods while maintaining parameter efficiency. Specifically, LoPA uses a low-rank decomposition of the instance-specific component to reduce the number of parameters, and combines the task-specific and instance-specific components through a gating function.

### Main Contributions

1. **Proposing LoPA**: A parameter-efficient and high-performance prompt tuning strategy.
2. **Validating Effectiveness**: Extensive experiments on natural language understanding, code understanding, and code generation tasks validate the effectiveness of LoPA. The results show that LoPA outperforms existing prompt tuning methods on multiple tasks and, in some cases, even surpasses full fine-tuning and LoRA.

### Experimental Results

- **Natural Language Understanding Tasks**: On six tasks from the GLUE benchmark, LoPA significantly outperforms traditional prompt tuning and DePT, with average improvements of 28.62 and 25.39 percentage points, respectively. LoPA also does well in limited-data settings, improving by 12.5 percentage points on MRPC and 6.13 percentage points on RTE.
- **Code Understanding Tasks**: On tasks from the CruxEval dataset, LoPA significantly improves the performance of baseline models, especially on larger foundation models such as Llama-3 and Phi-3, with gains ranging from 8 to 11 percentage points.
- **Code Generation Tasks**: On the MBPP dataset, LoPA achieves performance improvements comparable to IDPG while using significantly fewer parameters.

### Conclusion

By combining task-specific and instance-specific information and using a low-rank decomposition to reduce the number of parameters, LoPA addresses the scalability and performance shortcomings of traditional PEFT methods. Experimental results show that LoPA performs strongly across a variety of tasks, making it an efficient and effective method for customizing foundation models.
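
The gated combination of task-specific and instance-specific components described above can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the gate is an element-wise sigmoid applied to the low-rank instance-specific component, and that the result scales a shared task-specific prompt. All names (`lopa_soft_prompt`, `z_task`, `u`, `v`) and the sizes `m`, `d`, `r` are hypothetical. In practice the low-rank factors would be produced by an encoder from each input instance; here they are random stand-ins.

```python
import math
import random


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def low_rank_product(u, v):
    """Return u @ v^T for u of shape (m, r) and v of shape (d, r)."""
    return [[sum(a * b for a, b in zip(u_row, v_row)) for v_row in v] for u_row in u]


def lopa_soft_prompt(z_task, u, v):
    """Gate a shared task prompt (m x d) with a low-rank instance component.

    Assumption: the gate is sigmoid(u @ v^T), applied element-wise.
    """
    z_instance = low_rank_product(u, v)  # (m, d), rank at most r
    return [
        [t * sigmoid(i) for t, i in zip(task_row, inst_row)]
        for task_row, inst_row in zip(z_task, z_instance)
    ]


def random_matrix(rows, cols, rng):
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
m, d, r = 10, 64, 4  # prompt length, embedding size, rank (illustrative values)

z_task = random_matrix(m, d, rng)  # shared across all instances of a task
u = random_matrix(m, r, rng)       # stand-in for encoder output, per instance
v = random_matrix(d, r, rng)       # stand-in for encoder output, per instance

prompt = lopa_soft_prompt(z_task, u, v)

# Size of the instance-specific component with and without the factorization.
full_params = m * d            # a dense m x d matrix
low_rank_params = r * (m + d)  # the two factors u and v
print(full_params, low_rank_params)
```

With these illustrative sizes, the factorized instance-specific component needs `r * (m + d)` values instead of `m * d`, which is where the parameter saving comes from when `r` is small relative to `m` and `d`.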