DeepCommenter: a Deep Code Comment Generation Tool with Hybrid Lexical and Syntactical Information

Boao Li,Meng Yan,Xin Xia,Xing Hu,Ge Li,David Lo
DOI: https://doi.org/10.1145/3368089.3417926
2020-01-01
Abstract:As the scale of software projects increases, the code comments are more and more important for program comprehension. Unfortunately, many code comments are missing, mismatched or outdated due to tight development schedule or other reasons. Automatic code comment generation is of great help for developers to comprehend source code and reduce their workload. Thus, we propose a code comment generation tool (DeepCommenter) to generate descriptive comments for Java methods. DeepCommenter formulates the comment generation task as a machine translation problem and exploits a deep neural network that combines the lexical and structural information of Java methods. We implement DeepCommenter in the form of an Integrated Development Environment (i.e., Intellij IDEA) plug-in. Such plug-in is built upon a Client/Server architecture. The client formats the code selected by the user, sends request to the server and inserts the comment generated by the server above the selected code. The server listens for client’s request, analyzes the requested code using the pre-trained model and sends back the generated comment to the client. The pre-trained model learns both the lexical and syntactical information from source code tokens and Abstract Syntax Trees (AST) respectively and combines these two types of information together to generate comments. To evaluate DeepCommenter, we conduct experiments on a large corpus built from a large number of open source Java projects on GitHub. The experimental results on different metrics show that DeepCommenter outperforms the state-of-the-art approaches by a substantial margin.
What problem does this paper attempt to address?