An Empirical Study of Parameter-Efficient Fine-Tuning Methods for Pre-Trained Code Models.

Jiaxing Liu,Chaofeng Sha,Xin Peng
DOI: https://doi.org/10.1109/ase56229.2023.00125
2024-01-01
Abstract:Pre-trained code models (e.g. CodeBERT and CodeT5) have demonstrated their code intelligence in various software engineering tasks, such as code summarization. And full fine-tuning has become the typical approach to adapting these models to downstream tasks. However, full fine-tuning these large models can be computationally expensive and memory-intensive, particularly when training for multiple tasks. To alleviate this issue, several parameter-efficient fine-tuning methods (e.g. Adapter and LoRA) have been proposed to only train a small number of additional parameters, while keeping the original pre-trained parameters frozen. Although these methods claim superiority over the prior techniques, they seldom make a comprehensive and fair comparison on multiple software engineering tasks. Moreover, besides their potential in reducing fine-tuning costs and maintaining approximate performance, the effectiveness of these methods in low-resource, cross-language, and cross-project scenarios is inadequately studied. To this end, we first conduct experiments by fine-tuning state-of-the-art code models with these methods on both code understanding tasks and code generation tasks. The results show that, by tuning only 0.5% additional parameters, these methods may achieve comparable or higher performance than full fine-tuning in code understanding tasks, but they may exhibit slightly weaker performance in code generation tasks. We also investigate the impact of these methods with varying numbers of training samples and find that, a considerable number of samples (e.g. 1000 for clone detection) may be required for them to approximate the performance of full fine-tuning. Our experimental results in cross-language and cross-project scenarios demonstrate that by freezing most pre-trained parameters and tuning only 0.5% additional parameters, these methods achieve consistent improvements in models' transfer learning ability in comparison to full fine-tuning. Our code and data are available at https://github.com/anonymous-ase23/ CodeModelParameterEfficientFinetuning.
What problem does this paper attempt to address?