DeepLink: A Code Knowledge Graph Based Deep Learning Approach for Issue-Commit Link Recovery

Rui Xie, Long Chen, Wei Ye, Zhiyu Li, Tianxiang Hu, Dongdong Du, Shikun Zhang
DOI: https://doi.org/10.1109/saner.2019.8667969
2019-01-01
Abstract: Links between issue reports and the code commits that fix them can greatly reduce the maintenance costs of a software project. More often than not, however, these links are missing and thus cannot be fully utilized by developers. Current issue-commit link recovery approaches extract text features and code features, in terms of textual similarity, from issue reports and commit logs to train their models. These approaches are limited because semantic information can be lost. Furthermore, few of them consider the effect of the source code files related to a commit on issue-commit link recovery, let alone the semantics of the code context. To tackle these problems, we propose to construct a code knowledge graph of a code repository and generate embeddings of source code files to capture the semantics of the code context. We also use embeddings to capture the semantics of issue- or commit-related text. We then use these embeddings to calculate semantic similarity and code similarity using a deep learning approach before training an SVM binary classification model with additional features. Evaluations on real-world projects show that our approach, DeepLink, can outperform the state-of-the-art method.
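The feature step described in the abstract can be illustrated with a minimal sketch. This is not the DeepLink implementation: the abstract says the similarities are computed with a deep learning approach, whereas the sketch swaps in plain cosine similarity as a stand-in. The function names `cosine` and `link_features`, the toy vectors, and the assumption that issue, commit-message, and code-file embeddings share one vector space are all hypothetical.

```python
import math
from typing import List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def link_features(
    issue_vec: Sequence[float],
    commit_msg_vec: Sequence[float],
    file_vecs: Sequence[Sequence[float]],
) -> List[float]:
    """Build a two-element feature vector for one candidate issue-commit pair.

    Assumption: all embeddings live in the same vector space. Cosine similarity
    stands in for the learned similarity described in the abstract.
    """
    # Semantic similarity: issue text vs. commit message text.
    semantic = cosine(issue_vec, commit_msg_vec)
    # Code similarity: issue text vs. the best-matching file touched by the commit.
    code = max((cosine(issue_vec, f) for f in file_vecs), default=0.0)
    return [semantic, code]


# Toy example with hand-made 2-dimensional "embeddings".
features = link_features([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0], [1.0, 1.0]])
print(features)
```

In a pipeline like the one the abstract outlines, feature vectors of this kind, together with any additional features, would be passed to a binary classifier such as an SVM that labels each candidate pair as linked or not linked.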