Function Call Graph Context Encoding for Neural Source Code Summarization

Aakash Bansal, Zachary Eberhart, Zachary Karas, Yu Huang, Collin McMillan
DOI: https://doi.org/10.1109/tse.2023.3279774
IF: 7.4
2023-01-01
IEEE Transactions on Software Engineering
Abstract: Source code summarization is the task of writing natural language descriptions of source code. The primary use of these descriptions is in documentation for programmers. Automatic generation of these descriptions is a high-value research target because of the time it costs programmers to write them themselves. In recent years, a confluence of software engineering and artificial intelligence research has made inroads into automatic source code summarization through applications of neural models of that source code. However, an Achilles' heel of the vast majority of approaches is that they tend to rely solely on the context provided by the source code being summarized. Yet empirical studies in program comprehension are quite clear that the information needed to describe code much more often resides in the context surrounding that code, in the form of its function call graph. In this paper, we present a technique for encoding this call graph context for neural models of code summarization. We implement our approach as a supplement to existing approaches and show a statistically significant improvement over them. In a human study with 20 programmers, we show that programmers perceive the generated summaries to be generally as accurate, readable, and concise as human-written summaries.
Engineering, Electrical & Electronic; Computer Science, Software Engineering