Abstract: Large Language Models (LLMs) demonstrate impressive capabilities to generate accurate code snippets from natural language intents in a zero-shot setting, i.e., without task-specific fine-tuning. While prior studies have highlighted the advantages of fine-tuning LLMs, this process incurs high computational costs, making it impractical in resource-scarce environments, particularly for models with billions of parameters. To address these challenges, previous research explored In-Context Learning (ICL) as a strategy to guide the LLM generative process with task-specific prompt examples. However, ICL has drawbacks, such as the need to design contextually relevant prompts and the absence of learned task-specific parameters, which limit downstream task performance. In this context, we foresee Parameter-Efficient Fine-Tuning (PEFT) techniques as a promising approach to efficiently specialize LLMs to task-specific data while maintaining reasonable resource consumption. In this paper, we deliver a comprehensive study of PEFT techniques for LLMs in the automated code generation scenario. Our investigation reveals the superiority and potential of PEFT over ICL across a diverse set of LLMs. Additionally, we demonstrate the extended capabilities of PEFT, showcasing its ability to learn from two distinct datasets jointly without compromising performance. Furthermore, our study highlights the potential for tuning larger LLMs and for significantly reducing memory usage by combining PEFT with quantization. This study therefore opens opportunities for broader applications of PEFT in software engineering scenarios. Our code is available at https://github.com/martin-wey/peft-llm-code/.

Enhancing Low-Resource LLMs Classification with PEFT and Synthetic Data

Exploring LLMs as a Source of Targeted Synthetic Textual Data to Minimize High Confidence Misclassifications

LLMEmbed: Rethinking Lightweight LLM's Genuine Function in Text Classification

How to Train Data-Efficient LLMs

LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement

EE-MLLM: A Data-Efficient and Compute-Efficient Multimodal Large Language Model

Synthetic Data Generation in Low-Resource Settings via Fine-Tuning of Large Language Models

LLMs Are Few-Shot In-Context Low-Resource Language Learners

A Little Help Goes a Long Way: Efficient LLM Training by Leveraging Small LMs

Making LLMs Worth Every Penny: Resource-Limited Text Classification in Banking

Exploring Parameter-Efficient Fine-Tuning Techniques for Code Generation with Large Language Models

E^2-LLM: Efficient and Extreme Length Extension of Large Language Models

Empirical Analysis of the Strengths and Weaknesses of PEFT Techniques for LLMs

Can LLMs Augment Low-Resource Reading Comprehension Datasets? Opportunities and Challenges

Personalized Pieces: Efficient Personalized Large Language Models through Collaborative Efforts

eP-ALM: Efficient Perceptual Augmentation of Language Models

Few-Shot Fairness: Unveiling LLM's Potential for Fairness-Aware Classification

Large Language Models Are Zero-Shot Text Classifiers

Instruction Tuning Vs. In-Context Learning: Revisiting Large Language Models in Few-Shot Computational Social Science