Efilter: An effective fault localization based on information entropy with unlabelled test cases

Yan Xiaobo, Liu Bin, Wang Shihai, An Dong, Zhu Feng, Yang Yelin
DOI: https://doi.org/10.1016/j.infsof.2021.106543
IF: 3.9
2021-06-01
Information and Software Technology
Abstract:<h3 class="u-h4 u-margin-m-top u-margin-xs-bottom">Context:</h3><p>Automatic fault localization is essential to intelligent software systems. Most fault localization techniques assume that a perfect test oracle is available before debugging, which rarely holds in practice. In fact, a test suite often contains a number of unlabelled test cases, which have been shown to be useful in fault localization. However, because executions are diverse, not all unlabelled test cases are suitable for fault localization. Selecting inappropriate unlabelled test cases can even reduce fault localization efficiency.</p><h3 class="u-h4 u-margin-m-top u-margin-xs-bottom">Objective:</h3><p>To solve the problem of filtering unlabelled test cases, this work aims to construct a feasible framework that selects suitable unlabelled test cases for better fault localization.</p><h3 class="u-h4 u-margin-m-top u-margin-xs-bottom">Method:</h3><p>An entropy-based framework, Efilter, is constructed to filter unlabelled test cases. In Efilter, a Statement-based entropy and a Testsuite-based entropy are constructed to measure the localization uncertainty of a given test suite. An unlabelled test case is selected when its Statement-based entropy or Testsuite-based entropy is lower than the corresponding threshold. Further, feature integration strategies for both the Statement-based entropy and the Testsuite-based entropy are given to calculate the suspiciousness of statements.</p><h3 class="u-h4 u-margin-m-top u-margin-xs-bottom">Results:</h3><p>The efficiency of Efilter is evaluated on 6 open-source programs with 3 spectrum-based fault localization techniques. The results reveal that, measured by average <em>EXAM</em> score, Efilter improves fault localization efficiency by 18.8% with the Statement-based entropy and by 16.5% with the Testsuite-based entropy, compared with the strategy without Efilter.</p><h3 class="u-h4 u-margin-m-top u-margin-xs-bottom">Conclusion:</h3><p>Our results indicate that Efilter, with either the Statement-based entropy or the Testsuite-based entropy, can improve fault localization in scenarios that lack test oracles, serving as an enhancement for fault localization in practice.</p>
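The filtering idea in the Method paragraph can be illustrated with a minimal sketch. The abstract does not define the Statement-based or Testsuite-based entropy, so everything below is an assumption made for illustration: suspiciousness comes from the Ochiai formula, uncertainty is the Shannon entropy of the normalized scores, each unlabelled test is tried with both possible labels, and the threshold is the entropy of the labelled suite alone. The names `ochiai`, `suspiciousness_entropy` and `select_unlabelled` are hypothetical and are not taken from the paper.

```python
import math


def ochiai(coverage, failed):
    """Ochiai suspiciousness per statement.

    coverage: list of sets, the statement ids each test executes.
    failed:   list of bools, True if the corresponding test failed.
    """
    statements = sorted(set().union(*coverage))
    total_failed = sum(failed)
    scores = {}
    for s in statements:
        ef = sum(1 for cov, f in zip(coverage, failed) if f and s in cov)
        ep = sum(1 for cov, f in zip(coverage, failed) if not f and s in cov)
        denom = math.sqrt(total_failed * (ef + ep))
        scores[s] = ef / denom if denom > 0 else 0.0
    return scores


def suspiciousness_entropy(scores):
    """Shannon entropy (bits) of the normalized suspiciousness scores.

    Assumption: this stands in for the paper's entropy measures,
    which the abstract does not spell out.
    """
    total = sum(scores.values())
    if total == 0:
        # No statement is suspicious: treat as maximal uncertainty.
        return math.log2(len(scores)) if scores else 0.0
    probs = [v / total for v in scores.values() if v > 0]
    return -sum(p * math.log2(p) for p in probs)


def select_unlabelled(coverage, failed, unlabelled, threshold):
    """Indices of unlabelled tests whose best-case entropy is below threshold.

    Assumption: each unlabelled test is tried as 'passed' and as 'failed',
    and the lower of the two entropies is compared with the threshold.
    """
    kept = []
    for i, cov in enumerate(unlabelled):
        candidates = [
            suspiciousness_entropy(ochiai(coverage + [cov], failed + [guess]))
            for guess in (False, True)
        ]
        if min(candidates) < threshold:
            kept.append(i)
    return kept


# Toy data (hypothetical): two labelled tests and two unlabelled tests.
labelled_cov = [{1, 2, 3}, {1, 2, 4}]
labelled_failed = [True, False]
unlabelled_cov = [{1, 2}, {3, 4}]

baseline = suspiciousness_entropy(ochiai(labelled_cov, labelled_failed))
kept = select_unlabelled(labelled_cov, labelled_failed, unlabelled_cov, baseline)
print(baseline, kept)
```

In this toy setting, a test is kept only if adding it can make the suspiciousness distribution more concentrated than the labelled suite alone does. The actual thresholds and entropy definitions used by Efilter may differ.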
computer science, information systems, software engineering