Neural SZZ Algorithm.

Lingxiao Tang,Lingfeng Bao,Xin Xia,Zhongdong Huang
DOI: https://doi.org/10.1109/ase56229.2023.00037
2024-01-01
Abstract:The SZZ algorithm has been widely used for identifying bug-inducing commits. However, it suffers from low precision, as not all deletion lines in the bug-fixing commit are related to the bug fix. Previous studies have attempted to address this issue by using static methods to filter out noise, e.g., comments and refactoring operations in the bug-fixing commit. However, these methods have two limitations. First, it is challenging to include all refactoring and non-essential change patterns in a tool, leading to the potential exclusion of relevant lines and the inclusion of irrelevant lines. Second, applying these tools might not always improve performance. In this paper, to address the aforementioned challenges, we propose NEURALSZZ, a deep learning approach for detecting the root cause deletion lines in a bug-fixing commit and using them as input for the SZZ algorithm. NEURALSZZ first constructs a heterogeneous graph attention network model that captures the semantic relationships between each deletion line and the other deletion and addition lines. To pinpoint the root cause of a bug, Neuralszz uses a learning-to-rank technique to rank all deletion lines in the commit. To evaluate the effectiveness of NEURALSZZ, we utilize three datasets containing high-quality bug-fixing and bug-inducing commits. The experiment results show that NEURALSZZ outperforms various baseline methods, e.g., traditional machine learning-based approaches and BI-LSTM in identifying the root cause of bugs. Moreover, by utilizing the top-ranked deletion lines and applying the SZZ algorithm, Neuralszz demonstrates better precision and F1-score compared to previous SZZ algorithms.
What problem does this paper attempt to address?