Code comment generation based on graph neural network enhanced transformer model for code understanding in open-source software ecosystems

Li Kuang,Cong Zhou,Xiaoxian Yang
DOI: https://doi.org/10.1007/s10515-022-00341-1
IF: 1.677
2022-06-16
Automated Software Engineering
Abstract:In open-source software ecosystems, the scale of source code is getting larger and larger, and developers often use various methods (good code comments or method names, etc.) to make the code easier to read and understand. However, high-quality code comments or method names are often unavailable due to tight project schedules or other reasons in open-source software ecosystems such as Github. Therefore, in this work, we try to use deep learning models to generate appropriate code comments or method names to help software development and maintenance, which requires a non-trivial understanding of the code. Therefore, we propose a Graph neural network enhanced Transformer model (GTrans for short) to learn code representation to understand code better. Specifically, GTrans learns code representation from code sequences and graphs. We use a Transformer encoder to capture the global representation from code sequence and a graph neural network (GNN) encoder to focus on the local details in the code graph, and then use a decoder to combine both global and local representations by attention mechanism. We use three public datasets collected from GitHub to evaluate our model. In an extensive evaluation, we show that GTrans outperforms the state-of-the-art models up to 3.8% increase in METEOR metrics on code comment generation and outperforms the state-of-the-art models by margins of 5.8%–9.4% in ROUGE metrics on method name generation after some adjustments on the structure. Empirically, we find the method name generation task depends on more local information than global, and the code comment generation task is in contrast. Our data and code are available at https://github.com/zc-work/GTrans.
What problem does this paper attempt to address?