A review of automatic source code summarization

Xuejun Zhang, Xia Hou, Xiuming Qiao, Wenfeng Song
DOI: https://doi.org/10.1007/s10664-024-10553-6
IF: 3.762
2024-10-08
Empirical Software Engineering
Abstract: Code summarization plays a pivotal role in software engineering by offering developers a concise natural language description of source code semantics. As software complexity continues to escalate, code summarization confronts various challenges, including discrepancies between source code and its summaries, the absence of crucial or up-to-date information, and the inefficiency and resource demands of manual summarization. To address these challenges, Automatic Source Code Summarization (ASCS) has garnered widespread attention. This paper presents a comprehensive review and synthesis of ASCS research. It aims to provide an in-depth understanding of the core issues and challenges inherent in each phase of ASCS, illustrated with specific examples and application scenarios. Organized around the core phases of ASCS (data collection, source code modeling, code summary generation, and summary quality assessment), the paper compiles and assesses existing datasets, categorizes and examines prevalent source code modeling techniques, and surveys the methods for generating code summaries and evaluating their quality. It concludes with an exploration of future research avenues and emerging trends, and serves as a guide for readers to grasp the cutting-edge developments in this field, enriched by the analysis of pivotal research contributions.
computer science, software engineering