Abstract:Knowing the effectiveness of a test suite is essential for many activities such as assessing the test adequacy of code and guiding the generation of new test cases. Mutation testing is a commonly used defect injection technique for evaluating the effectiveness of a test suite. However, it is usually computationally expensive, as a large number of mutants (buggy versions) are needed to be generated from a production code under test and executed against the test suite. In order to reduce the expensive testing cost, recent studies proposed to use supervised models to predict the effectiveness of a test suite without executing the test suite against the mutants. Nonetheless, the training of such a supervised model requires labeled data, which still depends on the costly mutant execution. Furthermore, existing models are based on traditional supervised learning techniques, which assume that the training and testing data come from the same distribution. But, in practice, software systems are subject to considerable concept drifts, i.e., the same distribution assumption usually does not hold. This can lead to inaccurate predictions of a learned supervised model on the target code as time progresses. To tackle these problems, in this paper, we propose a Coverage-Based Unsupervised Approach (CBUA) for evaluating the effectiveness of a test suite. Given a production code under test, the corresponding mutants, and a test suite, CBUA first collects the coverage information of the mutated statements in the target production code under the execution of the test suite. Then, CBUA employs coverage to estimate the probability of each mutant being alive. As such, a mutation score is computed to evaluate the test suite effectiveness and the predicted labels (i.e., killed or alive) are obtained. The whole process only requires a one-time execution of the test suite against the target production code, without involving any mutant execution and any training data. CBUA can ensure the score monotonicity property (i.e., adding test cases to a test suite does not decrease its mutation score), which may be violated by a supervised approach. The experimental results show that CBUA is very competitive with the state-of-the-art supervised approaches in prediction accuracy. In particular, CBUA is shown to be more effective in finding mutants that are covered but not killed by a test suite, which is helpful in identifying the weaknesses in the current test suite and generating new test cases accordingly. Since CBUA is an easy-to-implement approach with a low cost, we suggest that it should be used as a baseline approach for comparison when any novel prediction approach is proposed in future studies.

Boundary Sampling to Boost Mutation Testing for Deep Learning Models.

Mutation Operator Reduction for Cost-effective Deep Learning Software Testing Via Decision Boundary Change Measurement

DeepMutation: Mutation Testing of Deep Learning Systems

How Higher Order Mutant Testing Performs for Deep Learning Models: A Fine-Grained Evaluation of Test Effectiveness and Efficiency Improved from Second-Order Mutant-Classification Tuples

Mutation Operator Reduction for Deep Learning System

DevMuT: Testing Deep Learning Framework Via Developer Expertise-Based Mutation

DeepKernel: 2D-kernels clustering based mutant reduction for cost-effective deep learning model testing

DeepMetis: Augmenting a Deep Learning Test Set to Increase its Mutation Score

Test Data Generation for Mutation Testing Based on Markov Chain Usage Model and Estimation of Distribution Algorithm

CBUA: A Probabilistic, Predictive, and Practical Approach for Evaluating Test Suite Effectiveness

BDMMT: Backdoor Sample Detection for Language Models through Model Mutation Testing

Mutation-Based Deep Learning Framework Testing Method in JavaScript Environment

What Are We Really Testing in Mutation Testing for Machine Learning? A Critical Reflection

Spotting Code Mutation for Predictive Mutation Testing

Adversarial Sample Detection for Deep Neural Network through Model Mutation Testing

MuNN: Mutation Analysis of Neural Networks

A Probabilistic Framework for Mutation Testing in Deep Neural Networks

An Exploratory Study on Using Large Language Models for Mutation Testing

A Novel Method of Mutation Clustering Based on Domain Analysis

Freeze-and-mutate: Abnormal Sample Identification for DL Applications Through Model Core Analysis.

Measuring Discrimination to Boost Comparative Testing for Multiple Deep Learning Models