Spotting Code Mutation for Predictive Mutation Testing

Yifan Zhao, Yizhou Chen, Zeyu Sun, Qingyuan Liang, Guoqing Wang, Dan Hao
DOI: https://doi.org/10.1145/3691620.3695491
2024-01-01
Abstract: Mutation testing is widely used to measure the test adequacy of a project. Despite its popularity, mutation testing is time-consuming and extremely expensive. To mitigate this problem, researchers have proposed Predictive Mutation Testing (PMT). Existing PMT approaches build classification models, based either on statistical program features or on program source code, to predict mutation testing results. Statistical feature-based PMT models incur high overhead to collect dynamic features and neglect the rich information inherent in code text. Text-based PMT models extract essential code elements as input and outperform the feature-based models. However, they encode code text in a plain way, so they are not sensitive to subtle differences between mutants and have difficulty capturing the correlation between mutants and tests. To address these challenges, we propose a new model, SODA. SODA uses a new learning strategy, Mutational Semantic Learning, to make the model spot code mutations and their impact on test behavior. In particular, we employ a new sampling strategy that reinforces the correspondence between mutants and tests by sampling same-mutant contrastive groups. We then employ contrastive learning so that the model captures subtle differences in mutants. We conduct experiments to investigate the performance of SODA. The results demonstrate that, in both the cross-project and cross-version scenarios, SODA achieves state-of-the-art classification performance (improving upon baselines by 5.32%-114.92% in kill-F1 score, 0.04%-25.54% in survive-F1 score, and 4.25%-60.43% in accuracy) and has the lowest mutation score error.
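The evaluation metrics named in the abstract can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not confirm: each mutant carries one binary label (1 = killed, 0 = survived), the mutation score is the fraction of killed mutants, and the mutation score error is the absolute difference between the predicted and actual mutation scores. The function names are hypothetical and are not taken from the paper or its artifact.

```python
from typing import Sequence


def f1_for_class(y_true: Sequence[int], y_pred: Sequence[int], positive: int) -> float:
    """F1 score with `positive` as the class of interest (1 = killed, 0 = survived)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of mutants whose predicted label matches the actual label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def mutation_score(labels: Sequence[int]) -> float:
    """Assumed definition: fraction of mutants labeled as killed."""
    return sum(labels) / len(labels)


def mutation_score_error(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Assumed definition: |predicted mutation score - actual mutation score|."""
    return abs(mutation_score(y_pred) - mutation_score(y_true))


if __name__ == "__main__":
    # Toy labels for five hypothetical mutants (not data from the paper).
    actual = [1, 1, 0, 0, 1]
    predicted = [1, 0, 0, 1, 1]
    print("kill-F1:", f1_for_class(actual, predicted, positive=1))
    print("survive-F1:", f1_for_class(actual, predicted, positive=0))
    print("accuracy:", accuracy(actual, predicted))
    print("mutation score error:", mutation_score_error(actual, predicted))
```

In this toy example two of the five mutants are misclassified, yet the predicted and actual mutation scores coincide, so the mutation score error is zero. This is one reason to report per-class F1 and accuracy alongside the mutation score error.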