Large Language Models for Software Engineering: A Systematic Literature Review

Xinyi Hou, Yanjie Zhao, Yue Liu, Zhou Yang, Kailong Wang, Li Li, Xiapu Luo, David Lo, John Grundy, Haoyu Wang
2024-04-10
Abstract: Large Language Models (LLMs) have significantly impacted numerous domains, including Software Engineering (SE). Many recent publications have explored LLMs applied to various SE tasks. Nevertheless, a comprehensive understanding of the application, effects, and possible limitations of LLMs on SE is still in its early stages. To bridge this gap, we conducted a systematic literature review (SLR) on LLM4SE, with a particular focus on understanding how LLMs can be exploited to optimize processes and outcomes. We select and analyze 395 research papers from January 2017 to January 2024 to answer four key research questions (RQs). In RQ1, we categorize different LLMs that have been employed in SE tasks, characterizing their distinctive features and uses. In RQ2, we analyze the methods used in data collection, preprocessing, and application, highlighting the role of well-curated datasets for successful LLM for SE implementation. RQ3 investigates the strategies employed to optimize and evaluate the performance of LLMs in SE. Finally, RQ4 examines the specific SE tasks where LLMs have shown success to date, illustrating their practical contributions to the field. From the answers to these RQs, we discuss the current state-of-the-art and trends, identifying gaps in existing research, and flagging promising areas for future study. Our artifacts are publicly available at https://github.com/xinyi-hou/LLM4SE_SLR.
Software Engineering, Artificial Intelligence
### What problem does this paper attempt to solve?

The main purpose of this paper is to fill the research gap in the application of Large Language Models (LLMs) in the field of Software Engineering (SE) through a Systematic Literature Review (SLR). Specifically:

1. **Comprehensive overview of LLMs in SE**: The paper identifies and classifies the LLM architectures (decoder-only, encoder-decoder, and encoder-only models) used to address SE tasks, giving a comprehensive overview of how they are applied.
2. **Data processing methods**: It investigates how SE-related datasets are collected, preprocessed, and used, including the criteria for dataset selection and the preprocessing steps.
3. **Optimization and evaluation techniques**: It explores the techniques used to optimize and evaluate the performance of LLMs in SE, including Parameter-Efficient Fine-Tuning (PEFT) methods and prompt engineering techniques, and assesses the commonly used evaluation metrics.
4. **Successful cases of specific SE tasks**: It analyzes which specific SE tasks, such as code generation and program repair, have successfully used LLMs, and describes the scope and nature of these applications.

By addressing these questions, the paper describes the current status and development trends of LLMs in SE, highlights the shortcomings of existing research, and proposes potential directions for future work. This helps researchers and practitioners understand how LLMs can be applied to SE and guides future exploration.