Delta4Ms: Improving mutation‐based fault localization by eliminating mutant bias
Hengyuan Liu, Zheng Li, Baolong Han, Yangtao Liu, Xiang Chen, Yong Liu
DOI: https://doi.org/10.1002/stvr.1872
2024-01-18
Software Testing Verification and Reliability
Abstract: This paper introduces Delta4Ms, a novel theoretical model that mitigates a previously unaddressed issue in mutation‐based fault localization (MBFL) known as 'mutant bias'. Drawing on the principles of signal theory, Delta4Ms adjusts for the influence of mutant bias in MBFL and thereby substantially enhances fault localization accuracy. Fault localization is a complex, costly and time‐consuming task in software debugging, and numerous automated techniques have been developed to expedite it. MBFL is one of the most widely studied of these techniques; it uses mutation analysis to generate mutants that reveal potential faults in the program. However, our theoretical analysis exposes an inherent conflict between the fundamental assumption and the essential meaning of existing MBFL suspiciousness. This conflict is caused by mutant bias. Intuitively, eliminating the mutant bias corrects the suspiciousness so that it measures more accurately the probability that the corresponding mutated statement is faulty. Delta4Ms is a fault localization approach designed to eliminate this bias: it models the actual suspiciousness and the mutant bias as the desired and false signal components, respectively. Our theoretical derivation shows that the average suspiciousness of mutants serves as an estimate of mutant bias. Delta4Ms uses this estimate to mitigate mutant bias, extracting the desired signal and yielding corrected suspiciousness for fault localization. To estimate mutant bias precisely, higher order mutants (HOMs) are incorporated. We conduct an extensive experimental evaluation of Delta4Ms on 320 real‐fault programs from Codeflaws. The results indicate that our model significantly outperforms existing spectrum‐based fault localization (SBFL) and MBFL techniques, showing a considerable improvement in fault localization effectiveness.
We further assess the robustness of Delta4Ms by examining different HOM ratios and HOM generation strategies. Moreover, when combined with test case reduction, Delta4Ms substantially reduces mutation execution cost with minimal loss of accuracy. Finally, we perform preliminary experiments on 15 real‐fault programs from the Defects4J benchmark to assess how well the model's fault localization effectiveness generalizes.
computer science, software engineering
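The bias-correction idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract states only that the average suspiciousness of mutants estimates mutant bias, so the sketch assumes that the corrected score is the observed score minus that average. The Ochiai-style formula is a common choice for mutant suspiciousness in MBFL rather than one the abstract names, and the function names, the `reference` grouping (which mutants are averaged for each mutant, for example HOMs built from it) and the max-based statement aggregation are all hypothetical.

```python
from math import sqrt
from statistics import mean


def ochiai(killed_fail: int, killed_pass: int, total_fail: int) -> float:
    """Ochiai-style mutant suspiciousness from kill counts (assumed formula)."""
    denom = sqrt(total_fail * (killed_fail + killed_pass))
    return killed_fail / denom if denom else 0.0


def corrected_scores(observed, reference):
    """Subtract an average-based bias estimate from each observed score.

    observed:  {mutant_id: raw suspiciousness}
    reference: {mutant_id: [scores of the mutants used to estimate its bias]}
               (hypothetical grouping, e.g. HOMs derived from that mutant)
    """
    return {
        m: score - (mean(reference[m]) if reference.get(m) else 0.0)
        for m, score in observed.items()
    }


def rank_statements(corrected, mutant_to_stmt):
    """Score each statement by its best corrected mutant, sorted descending."""
    stmt_score = {}
    for m, score in corrected.items():
        stmt = mutant_to_stmt[m]
        stmt_score[stmt] = max(score, stmt_score.get(stmt, float("-inf")))
    return sorted(stmt_score, key=stmt_score.get, reverse=True)


# Toy data: m1 looks more suspicious before correction, but most of its
# score is explained by the (assumed) bias estimate.
observed = {"m1": 0.8, "m2": 0.7}
reference = {"m1": [0.7, 0.7], "m2": [0.2, 0.2]}
mutant_to_stmt = {"m1": "line 10", "m2": "line 12"}

ranking = rank_statements(corrected_scores(observed, reference), mutant_to_stmt)
print(ranking)  # ['line 12', 'line 10']
```

In this toy example the uncorrected scores would rank line 10 first, whereas the corrected scores rank line 12 first. That reordering is the kind of effect the abstract attributes to removing mutant bias.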