Identify and Update Test Cases when Production Code Changes: A Transformer-Based Approach

Xing Hu, Zhuang Liu, Xin Xia, Zhongxin Liu, Tongtong Xu, Xiaohu Yang
DOI: https://doi.org/10.1109/ase56229.2023.00165
2024-01-01
Abstract: Software testing is one of the most essential parts of the software lifecycle and requires a substantial amount of time and effort. During software evolution, test cases should co-evolve with the production code. However, the co-evolution of test cases often fails due to tight project schedules and other reasons. Obsolete test cases increase the cost of software maintenance, may fail to reveal faults, and can even lead to future bugs. Therefore, it is essential to detect and update these obsolete test cases in time. In this paper, we propose a novel approach, Ceprot (Co-Evolution of Production-Test Code), to identify outdated test cases and update them automatically according to changes in the production code. Ceprot consists of two stages: obsolete test identification and obsolete test updating. Specifically, given a production code change and a corresponding test case, Ceprot first identifies whether the test case should be updated. If the test case is identified as obsolete, Ceprot updates it to a new version. To evaluate the effectiveness of the two stages, we construct two datasets, which focus on method-level production code changes and the updates to their obsolete test cases. The experimental results show that Ceprot can effectively identify obsolete test cases, with precision and recall of 98.3% and 90.0%, respectively. In addition, the test cases generated by Ceprot are identical to the ground truth for 12.3% of the samples that Ceprot identifies as obsolete. We also conduct a dynamic evaluation and a human evaluation to measure the effectiveness of the test cases updated by Ceprot. 48.0% of the updated test cases can be compiled, and the average coverage of the updated test cases is 34.2%, which achieves an 89% coverage improvement over the obsolete tests. We believe that this study can motivate the co-evolution of production and test code.
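The two-stage flow described in the abstract (identify first, update only if obsolete) can be sketched as follows. This is a minimal illustration, not Ceprot's implementation: the `Sample` type, the `co_evolve` function, and the two toy stand-ins `toy_is_obsolete` and `toy_update_test` are all hypothetical names introduced here, and the string heuristics merely take the place of the Transformer-based models the paper refers to.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Sample:
    old_production: str  # production method before the change
    new_production: str  # production method after the change
    old_test: str        # test case written against the old version


def co_evolve(
    sample: Sample,
    is_obsolete: Callable[[Sample], bool],
    update_test: Callable[[Sample], str],
) -> Optional[str]:
    """Stage 1 decides whether the test is obsolete; stage 2 runs only if it is."""
    if not is_obsolete(sample):
        return None  # test is treated as still valid, so nothing is generated
    return update_test(sample)


# Toy stand-ins for the two stages (hypothetical, for illustration only).
def toy_is_obsolete(s: Sample) -> bool:
    # Flag the test when a method it calls no longer appears in the new code.
    return "getName" in s.old_test and "getName" not in s.new_production


def toy_update_test(s: Sample) -> str:
    # Rewrite the stale call to match the renamed production method.
    return s.old_test.replace("getName", "getFullName")


renamed = Sample(
    old_production="String getName()",
    new_production="String getFullName()",
    old_test='assertEquals("a", u.getName());',
)
unchanged = Sample(
    old_production="int size()",
    new_production="int size()",
    old_test="assertEquals(0, l.size());",
)

updated = co_evolve(renamed, toy_is_obsolete, toy_update_test)
skipped = co_evolve(unchanged, toy_is_obsolete, toy_update_test)
```

Passing the two stages in as callables keeps the control flow separate from whatever models implement them, which mirrors the abstract's point that the stages are evaluated independently.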