Crex: Predicting patch correctness in automated repair of C programs through transfer learning of execution semantics

Dapeng Yan, Kui Liu, Yuqing Niu, Li Li, Zhe Liu, Zhiming Liu, Jacques Klein, Tegawendé F. Bissyandé
DOI: https://doi.org/10.1016/j.infsof.2022.107043
IF: 3.9
2022-12-01
Information and Software Technology
Abstract: A significant body of automated program repair literature relies on test suites to assess the validity of generated patches. Because such oracles are weak, state-of-the-art repair tools can validate some patches that overfit the test cases but are actually incorrect. This situation has become a prime concern in APR, hindering its adoption by the industry. This work investigates execution semantic features based on micro-traces, a form of under-constrained dynamic traces. We build on transfer learning to explore function code representations that are amenable to semantic similarity computation and can therefore be leveraged for classifying patch correctness. Our Crex prototype implementation is based on the Trex framework. Experimental results on patches generated by the CoCoNut APR tool on CodeFlaws programs indicate that our approach can yield high accuracy in predicting patch correctness. The learned embeddings were proven to capture semantic similarities between functions, which was instrumental in training a classifier that identifies patch correctness by learning to discriminate between correctly patched code and incorrectly patched code based on their semantic similarity with the buggy function.
computer science, information systems, software engineering
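The classification idea described in the abstract (comparing a patched function with the buggy function through the similarity of their embeddings) can be illustrated with a minimal sketch. This is not the Crex implementation: the embedding vectors are assumed to come from some pretrained model, the function names `cosine_similarity`, `fit_centroids`, and `predict` are hypothetical, and the nearest-mean rule is a stand-in for whatever classifier the authors actually trained.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def fit_centroids(similarities, labels):
    """Mean buggy-vs-patched similarity for correct and for incorrect patches.

    Assumes the training set contains at least one patch of each class.
    """
    correct = [s for s, y in zip(similarities, labels) if y]
    incorrect = [s for s, y in zip(similarities, labels) if not y]
    return sum(correct) / len(correct), sum(incorrect) / len(incorrect)


def predict(similarity, centroids):
    """Label a patch as correct if its similarity is nearer the 'correct' mean."""
    mean_correct, mean_incorrect = centroids
    return abs(similarity - mean_correct) <= abs(similarity - mean_incorrect)


if __name__ == "__main__":
    # Toy, made-up embeddings; real ones would come from a trained model.
    buggy = [1.0, 0.0, 0.5]
    training_patches = [
        ([0.9, 0.1, 0.5], True),
        ([1.0, 0.0, 0.4], True),
        ([0.0, 1.0, 0.1], False),
        ([0.1, 0.9, 0.0], False),
    ]
    sims = [cosine_similarity(buggy, emb) for emb, _ in training_patches]
    labels = [label for _, label in training_patches]
    centroids = fit_centroids(sims, labels)

    candidate = [0.8, 0.2, 0.5]
    print(predict(cosine_similarity(buggy, candidate), centroids))
```

The nearest-mean rule makes no assumption about whether correct patches are more or less similar to the buggy function than incorrect ones; that direction is learned from the labeled examples.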