Generative Software Engineering

Yuan Huang, Yinan Chen, Xiangping Chen, Junqi Chen, Rui Peng, Zhicao Tang, Jinbo Huang, Furen Xu, Zibin Zheng

2024-04-03

Abstract: The rapid development of deep learning techniques, improved computational power, and the availability of vast training data have led to significant advancements in pre-trained models and large language models (LLMs). Pre-trained models based on architectures such as BERT and Transformer, as well as LLMs like ChatGPT, have demonstrated remarkable language capabilities and found applications in software engineering (SE). SE tasks can be divided into many categories, among which generative tasks attract the most attention from researchers. Pre-trained models and LLMs possess powerful language representation and contextual awareness capabilities, enabling them to leverage diverse training data and adapt to generative tasks through fine-tuning, transfer learning, and prompt engineering. These advantages make them effective tools for generative tasks, on which they have demonstrated excellent performance. In this paper, we present a comprehensive literature review of generative tasks in SE using pre-trained models and LLMs. We accurately categorize SE generative tasks based on software engineering methodologies and summarize the advanced pre-trained models and LLMs involved, as well as the datasets and evaluation metrics used. Additionally, we identify key strengths, weaknesses, and gaps in existing approaches, and propose potential research directions. This review aims to provide researchers and practitioners with an in-depth analysis and guidance on the application of pre-trained models and LLMs in generative tasks within SE.

Software Engineering

What problem does this paper attempt to address?

The paper addresses the application of pre-trained models and large language models (LLMs) to generative tasks in software engineering (SE). Existing surveys cover this area only partially: they lack a comprehensive analysis of model development stages, a clear distinction between task types, and a systematic treatment. Through a systematic literature review, the paper classifies the SE generative tasks handled with pre-trained models and LLMs, summarizes the related state-of-the-art models, datasets, and evaluation metrics, identifies the strengths, weaknesses, and research gaps of existing methods, and proposes possible future research directions. The generative tasks covered include requirement generation, code generation, code summarization, test case generation, patch generation, code optimization, and code translation. The paper also describes its literature retrieval methods, selection criteria, and data analysis process, which are intended to ensure comprehensiveness and accuracy.

