An Empirical Comparison Of Compiler Testing Techniques

Junjie Chen,Wenxiang Hu,Dan Hao,Yingfei Xiong,Hongyu Zhang,Lu Zhang,Bing Xie
DOI: https://doi.org/10.1145/2884781.2884878
2016-01-01
Abstract:Compilers, as one of the most important infrastructure of today's digital world, are expected to be trustworthy. Different testing techniques are developed for testing compilers automatically. However, it is unknown so far how these testing techniques compared to each other in terms of testing effectiveness: how many bugs a testing technique can find within a time limit.In this paper, we conduct a systematic and comprehensive empirical comparison of three compiler testing techniques, namely, Randomized Differential Testing (RDT), a variant of RDT Different Optimization Levels (DOL), and Equivalence Modulo Inputs (EMI). Our results show that DOL is more effective at detecting bugs related to optimization, whereas RDT is more effective at detecting other types of bugs, and the three techniques can complement each other to a certain degree.Furthermore, in order to understand why their effectiveness differs, we investigate three factors that influence the effectiveness of compiler testing, namely, efficiency, strength of test oracles, and effectiveness of generated test programs. The results indicate that all the three factors are statistically significant, and efficiency has the most significant impact.
What problem does this paper attempt to address?