On the Reliability of Coverage Data for Fault Localization

Huan Xie, Maojin Li, Yan Lei, Jian Hu, Shanshan Li, Xiaoguang Mao, Yue Yu
DOI: https://doi.org/10.1109/apsec60848.2023.00016
2023-01-01
Abstract: High-quality input data is the foundation of many tasks. Inaccurate data can undermine even elaborate algorithms and significantly distort their output. Fault localization is no exception: its techniques depend on accurate and reliable data to be effective. Many fault localization techniques analyze coverage information to detect bug positions. However, the source coverage data suffers from various problems, such as imbalanced data and coincidental correctness, which make it unreliable for fault localization. To mitigate the potential adverse effect of these unreliable factors, we propose Orlando, a cOveRage-based decoupLing And recoNstructing Data apprOach for fault localization. Orlando optimizes the coverage data by synthesizing passing coverage with less coincidental correctness and failing coverage with more balanced data. The reconstructed data provides more reliable source data for fault localization. We evaluate Orlando on the widely used Defects4J benchmark and demonstrate its effectiveness in improving two spectrum-based and two deep learning-based methods. Furthermore, Orlando outperforms state-of-the-art data optimization approaches in fault localization.
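To make the coincidental-correctness problem concrete, the sketch below scores statements with Ochiai, a common spectrum-based suspiciousness formula. It is not Orlando's method, which the abstract does not detail; the coverage matrix, the test outcomes, and the function name `ochiai` are hypothetical and chosen only for illustration.

```python
import math

def ochiai(coverage, failed):
    """Ochiai suspiciousness per statement.

    coverage[t][s] is 1 if test t executes statement s, else 0.
    failed[t] is True if test t fails.
    """
    total_failed = sum(failed)
    scores = []
    for stmt in range(len(coverage[0])):
        # ef / ep: failing / passing tests that execute this statement
        ef = sum(1 for row, f in zip(coverage, failed) if row[stmt] and f)
        ep = sum(1 for row, f in zip(coverage, failed) if row[stmt] and not f)
        denom = math.sqrt(total_failed * (ef + ep))
        scores.append(ef / denom if denom else 0.0)
    return scores

# Hypothetical program with three statements; statement 1 is the faulty one.
coverage = [
    [1, 1, 0],  # t1: fails
    [1, 0, 1],  # t2: passes
    [1, 1, 0],  # t3: passes although it executes the faulty statement
]
failed = [True, False, False]

with_cc = ochiai(coverage, failed)
# Same data with the coincidentally correct test t3 left out.
without_cc = ochiai(coverage[:2], failed[:2])

print("with t3:   ", with_cc)
print("without t3:", without_cc)
```

In this toy data, the passing test `t3` executes the faulty statement and so lowers its score; leaving `t3` out raises the score of statement 1. The example also has only one failing test against two passing ones, a small instance of the imbalance the abstract mentions.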