Mining Python fix patterns via analyzing fine-grained source code changes

Yilin Yang, Tianxing He, Yang Feng, Shaoying Liu, Baowen Xu
DOI: https://doi.org/10.1007/s10664-021-10087-1
IF: 3.762
2022-01-28
Empirical Software Engineering
Abstract: Many code changes are inherently repetitive, and researchers exploit this repetitiveness to generate bug fix patterns. Automatic Program Repair (APR) can automatically detect and fix bugs, thus helping developers improve the quality of software products. As a critical component of APR, software bug fix patterns have been shown by existing studies to be very effective in detecting and fixing bugs in different programming languages (e.g., Java/C++); yet the fix patterns proposed by these studies cannot be directly applied to Python programs because of syntactic incompatibilities and a lack of analysis of dynamic features. In this paper, we proposed a mining approach to identify fix patterns of Python programs by extracting fine-grained bug-fixing code changes. We first collected bug reports from GitHub repositories and employed the abstract syntax tree edit distance to cluster similar bug-fixing code changes into fix patterns. We then evaluated the effectiveness of these fix patterns by applying them to single-hunk bugs in two benchmarks (BugsInPy and QuixBugs). The results show that 13 out of 101 real bugs can be fixed without human intervention; that is, the generated patch is identical or semantically equivalent to the developer’s patch. We also evaluated the fix patterns in the wild: for each complex bug, 15% of the buggy code could be fixed, and 37% of the buggy code could be matched by fix patterns.
computer science, software engineering
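The clustering step mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: as a simplifying assumption it replaces a true tree edit distance with a Levenshtein distance over flattened AST node-type sequences, and it groups changes with a greedy single-pass scheme. The names `node_types`, `change_distance`, `cluster_changes`, the `CHANGES` sample data, and the `0.3` threshold are all hypothetical and chosen only for illustration.

```python
import ast
from typing import List, Tuple

Change = Tuple[str, str]  # (buggy snippet, fixed snippet)


def node_types(code: str) -> List[str]:
    """Flatten a snippet's AST into a breadth-first list of node-type names."""
    return [type(node).__name__ for node in ast.walk(ast.parse(code))]


def edit_distance(a: List[str], b: List[str]) -> int:
    """Classic Levenshtein distance between two sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            curr.append(min(prev[j] + 1,              # delete x
                            curr[j - 1] + 1,          # insert y
                            prev[j - 1] + (x != y)))  # substitute
        prev = curr
    return prev[-1]


def signature(change: Change) -> List[str]:
    """Abstract a change as node types before and after the fix."""
    buggy, fixed = change
    return node_types(buggy) + ["->"] + node_types(fixed)


def change_distance(c1: Change, c2: Change) -> float:
    """Normalized distance in [0, 1]; 0 means structurally identical changes."""
    s1, s2 = signature(c1), signature(c2)
    return edit_distance(s1, s2) / max(len(s1), len(s2))


def cluster_changes(changes: List[Change], threshold: float) -> List[List[int]]:
    """Greedy clustering: join the first cluster whose representative
    (its first member) is within the threshold, else start a new cluster."""
    clusters: List[List[int]] = []
    for idx, change in enumerate(changes):
        for cluster in clusters:
            if change_distance(changes[cluster[0]], change) <= threshold:
                cluster.append(idx)
                break
        else:
            clusters.append([idx])
    return clusters


# Hypothetical bug-fixing changes: two "== None" fixes and one index fix.
CHANGES: List[Change] = [
    ("if x == None: pass", "if x is None: pass"),
    ("if y == None: pass", "if y is None: pass"),
    ("z = items[i]", "z = items[i - 1]"),
]

print(cluster_changes(CHANGES, threshold=0.3))  # -> [[0, 1], [2]]
```

Because identifier names are dropped and only node types are compared, the two `== None` to `is None` changes collapse into one cluster, which is the intuition behind treating a cluster as a candidate fix pattern. A real tree edit distance would additionally respect parent-child structure rather than a flattened sequence.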