Regression test prioritization leveraging source code similarity with tree kernels

Francesco Altiero, Anna Corazza, Sergio Di Martino, Adriano Peron, Luigi Libero Lucio Starace
DOI: https://doi.org/10.1002/smr.2653
2024-02-17
Journal of Software: Evolution and Process
Abstract: Not all changes to a codebase are equal: some modifications (e.g., heavy refactoring) are more critical than others (e.g., renaming local variables). In this paper, we present two regression test prioritization techniques, namely, method‐level tree kernel prioritization (MTK) and method‐level tree kernel with quotient set (MTK‐QS), which leverage tree kernel functions to measure the structural similarity of changed methods and direct testing efforts towards code affected by more critical changes. Our experiments show that these techniques can achieve a significantly higher fault detection rate than other traditional and widely used approaches.
Regression test prioritization (RTP) is an active research field, aiming at re‐ordering the tests in a test suite to maximize the rate at which faults are detected. A number of RTP strategies have been proposed, leveraging different factors to reorder tests. Some techniques include an analysis of changed source code, to assign higher priority to tests stressing modified parts of the codebase. Still, most of these change‐based solutions focus on simple text‐level comparisons among versions. We believe that measuring source code changes in a more refined way, capable of discriminating between mere textual changes (e.g., renaming of a local variable) and more structural changes (e.g., changes in the control flow), could lead to significant benefits in RTP, under the assumption that major structural changes are also more likely to introduce faults. To this end, we propose two novel RTP techniques that leverage tree kernels (TKs), a class of similarity functions widely used in natural language processing on tree‐structured data. In particular, we apply TKs to abstract syntax trees of source code, to more precisely quantify the extent of structural changes in the source code, and prioritize tests accordingly.
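To make the idea concrete, the following is a minimal sketch of how a tree kernel could score structural change between two versions of a method and how those scores could order tests. It is not the authors' MTK or MTK‐QS algorithm, whose details the abstract does not give. It assumes a simplified subset‐tree‐style kernel with a decay factor, toy abstract syntax trees encoded as nested tuples with identifiers abstracted away, and a hypothetical coverage map from tests to changed methods; all function and variable names are illustrative.

```python
import math
from functools import lru_cache

# Assumption: a tree is a nested tuple (label, child, child, ...); a leaf is (label,).

def nodes(tree):
    """Yield every node (subtree) of the tree in pre-order."""
    yield tree
    for child in tree[1:]:
        yield from nodes(child)

def production(node):
    """The node label followed by the labels of its children, in order."""
    return (node[0],) + tuple(child[0] for child in node[1:])

@lru_cache(maxsize=None)
def common(n1, n2, decay):
    """Weighted count of tree fragments rooted at both n1 and n2."""
    if production(n1) != production(n2):
        return 0.0
    score = decay
    for c1, c2 in zip(n1[1:], n2[1:]):
        score *= 1.0 + common(c1, c2, decay)
    return score

def tree_kernel(t1, t2, decay=0.5):
    """Sum the shared-fragment counts over all pairs of nodes."""
    return sum(common(a, b, decay) for a in nodes(t1) for b in nodes(t2))

def similarity(t1, t2, decay=0.5):
    """Normalized kernel in [0, 1]; 1.0 means structurally identical trees."""
    k12 = tree_kernel(t1, t2, decay)
    return k12 / math.sqrt(tree_kernel(t1, t1, decay) * tree_kernel(t2, t2, decay))

def prioritize(coverage, changed_methods):
    """Order tests by the total structural change of the methods they cover."""
    change = {name: 1.0 - similarity(old, new)
              for name, (old, new) in changed_methods.items()}
    def score(test):
        return sum(change.get(method, 0.0) for method in coverage[test])
    return sorted(coverage, key=score, reverse=True)

# Toy data (hypothetical): a pure rename leaves the tree shape unchanged,
# whereas adding a branch changes the control flow.
renamed_old = ("Method", ("Assign", ("Name",), ("Literal",)))
renamed_new = ("Method", ("Assign", ("Name",), ("Literal",)))
flow_old = ("Method", ("Return", ("Name",)))
flow_new = ("Method",
            ("If", ("Cond",), ("Return", ("Name",))),
            ("Return", ("Literal",)))

changed = {"renamed": (renamed_old, renamed_new), "rebranched": (flow_old, flow_new)}
coverage = {"test_rename": ["renamed"], "test_flow": ["rebranched"]}

print(prioritize(coverage, changed))
```

In this toy setting the test covering the control‐flow change is ranked ahead of the test covering the rename, which matches the intuition stated above that structural changes deserve testing effort first.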
We assessed the effectiveness of the proposals by conducting an empirical study on five real‐world Java projects that have also been used in a number of RTP‐related papers. For each considered pair of software versions (i.e., old version, new version) in the evolution of the involved projects, we automatically generated 100 variations with artificially injected faults, leading to over 5k different software evolution scenarios overall. We compared the proposed prioritization approaches against well‐known prioritization techniques, evaluating both their effectiveness and their execution times. Our findings show that leveraging more refined code change analysis techniques to quantify the extent of changes in source code can lead to relevant improvements in prioritization effectiveness, while typically introducing negligible execution overhead.
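The abstract does not name the metric behind "fault detection rate". A common choice in the RTP literature is the Average Percentage of Faults Detected (APFD), and the sketch below assumes it purely for illustration. For an ordering of n tests and m faults, where TF_i is the 1‐based position of the first test that reveals fault i, APFD = 1 - (TF_1 + ... + TF_m) / (n * m) + 1 / (2n). The test names and fault map are hypothetical, and every fault is assumed to be revealed by at least one test in the ordering.

```python
def apfd(ordering, revealing_tests):
    """Average Percentage of Faults Detected for one test ordering.

    ordering: list of test names in execution order.
    revealing_tests: maps each fault to the set of tests that reveal it
    (assumption: every fault is revealed by at least one test in `ordering`).
    """
    n = len(ordering)
    position = {test: index + 1 for index, test in enumerate(ordering)}
    first_hits = [min(position[test] for test in tests)
                  for tests in revealing_tests.values()]
    m = len(first_hits)
    return 1.0 - sum(first_hits) / (n * m) + 1.0 / (2 * n)

# Hypothetical example: two faults, four tests.
faults = {"fault_1": {"t3"}, "fault_2": {"t1", "t4"}}
original = ["t1", "t2", "t3", "t4"]
reordered = ["t3", "t1", "t2", "t4"]

print(apfd(original, faults))   # 0.625
print(apfd(reordered, faults))  # 0.75
```

Moving the fault‐revealing test t3 to the front raises the score from 0.625 to 0.75, which is the kind of improvement a prioritization technique aims for.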
computer science, software engineering