Coverage Prediction for Accelerating Compiler Testing

Junjie Chen, Guancheng Wang, Dan Hao, Yingfei Xiong, Hongyu Zhang, Lu Zhang, Bing Xie
DOI: https://doi.org/10.1109/tse.2018.2889771
IF: 7.4
2021-02-01
IEEE Transactions on Software Engineering
Abstract: Compilers are one of the most fundamental software systems. Compiler testing is important for assuring the quality of compilers. Due to the crucial role of compilers, they have to be well tested. Therefore, automated compiler testing techniques (those based on randomly generated programs) tend to run a large number of test programs (which are test inputs of compilers). The cost for compilation and execution for these test programs is significant. These techniques can take a long period of testing time to detect a relatively small number of compiler bugs. That may cause many practical problems, e.g., bringing a lot of costs including time costs and financial costs, and delaying the development/release cycle. Recently, some approaches have been proposed to accelerate compiler testing by executing test programs that are more likely to trigger compiler bugs earlier according to some criteria. However, these approaches ignore an important aspect in compiler testing: different test programs may have similar test capabilities (i.e., testing similar functionalities of a compiler, even detecting the same compiler bug), which may largely discount their acceleration effectiveness if the test programs with similar test capabilities are executed all the time. Test coverage is a proper approximation to help distinguish them, but collecting coverage dynamically is infeasible in compiler testing since most test programs are generated on the fly by automatic test-generation tools like Csmith. In this paper, we propose the first method to predict test coverage statically for compilers, and then propose to prioritize test programs by clustering them according to the predicted coverage information. The novel approach to accelerating compiler testing through coverage prediction is called COP (short for COverage Prediction).
Our evaluation on GCC and LLVM demonstrates that COP significantly accelerates compiler testing, achieving an average of 51.01 percent speedup in test execution time on an existing dataset including three old release versions of the compilers and achieving an average of 68.74 percent speedup on a new dataset including 12 latest release versions. Moreover, COP outperforms the state-of-the-art acceleration approach significantly by improving 17.16%∼82.51% speedups in different settings on average.
engineering, electrical & electronic; computer science, software engineering
What problem does this paper attempt to address?
This paper addresses how to accelerate compiler testing by statically predicting the coverage that test programs achieve on the compiler. Compiler testing is an important means of ensuring compiler quality. However, automated compiler testing techniques based on randomly generated programs must run a large number of test programs, which is time-consuming and costly and may delay the development/release cycle. Existing approaches accelerate testing by executing first the test programs that are more likely to trigger compiler bugs, but they ignore the fact that different test programs may have similar test capabilities: they may test similar functionalities of the compiler or even detect the same compiler bug, which can greatly reduce the acceleration effect.

To solve this problem, the paper proposes COP (COverage Prediction), the first method to predict test coverage statically for compilers. COP predicts the coverage of each new test program on compiler modules (such as source files or methods) and clusters the test programs according to the predicted coverage, so that test programs with different test capabilities can be told apart. COP also uses the existing LET approach to estimate, for each test program, the probability of triggering a bug per unit time, and prioritizes test programs accordingly.

The main contributions of the paper are as follows:
1. **Feature identification**: Three types of features (language features, operation features, and structural features) are defined to characterize the coverage of test programs.
2. **Coverage prediction**: A static coverage prediction method based on historical coverage information and test program features is proposed.
3. **Applying predicted coverage to accelerate compiler testing**: The COP method is proposed to accelerate compiler testing using the predicted coverage information.

The experimental results show that COP significantly outperforms the state-of-the-art approach LET in accelerating compiler testing. COP achieves an average speedup in test execution time of 51.01% on the existing dataset and 68.74% on the new dataset, and its improvement over the state-of-the-art approach ranges from 17.16% to 82.51% on average across different settings.
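
The cluster-then-prioritize idea can be illustrated with a minimal Python sketch. It is not the paper's implementation. The predicted coverage sets and bug-revealing scores are taken as given inputs, because this summary does not describe the learner that produces them. A simple greedy Jaccard-threshold clustering stands in for whatever clustering algorithm COP actually uses. The names `TestProgram`, `cluster_by_coverage`, `prioritize`, the threshold value, and the module names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TestProgram:
    name: str
    predicted_coverage: frozenset  # compiler modules predicted to be covered
    score: float  # assumed bug-revealing probability per unit time


def jaccard(a, b):
    """Jaccard similarity of two sets (1.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cluster_by_coverage(programs, threshold=0.5):
    """Greedy clustering: a program joins the first cluster whose
    first member has sufficiently similar predicted coverage."""
    clusters = []
    for prog in programs:
        for cluster in clusters:
            representative = cluster[0]
            if jaccard(prog.predicted_coverage,
                       representative.predicted_coverage) >= threshold:
                cluster.append(prog)
                break
        else:
            clusters.append([prog])
    return clusters


def prioritize(programs, threshold=0.5):
    """Round-robin over clusters, taking the highest-scoring
    remaining program from each cluster in turn."""
    clusters = [sorted(c, key=lambda p: p.score, reverse=True)
                for c in cluster_by_coverage(programs, threshold)]
    # Visit clusters in descending order of their best program's score.
    clusters.sort(key=lambda c: c[0].score, reverse=True)
    order = []
    rank = 0
    while len(order) < len(programs):
        for cluster in clusters:
            if rank < len(cluster):
                order.append(cluster[rank])
        rank += 1
    return order


# Hypothetical module names and scores, for illustration only.
programs = [
    TestProgram("t1", frozenset({"parser.c", "fold.c"}), 0.9),
    TestProgram("t2", frozenset({"parser.c", "fold.c"}), 0.8),
    TestProgram("t3", frozenset({"loop-opt.c", "regalloc.c"}), 0.5),
    TestProgram("t4", frozenset({"loop-opt.c"}), 0.4),
]
order_names = [p.name for p in prioritize(programs)]
print(order_names)  # ['t1', 't3', 't2', 't4']
```

Ordering by score alone would run `t1` and `t2` back to back even though they are predicted to exercise the same modules. The round-robin over clusters moves `t3` ahead of `t2`, so a program with different predicted coverage is executed earlier.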