Abstract: Compilers are among the most fundamental software systems, and because of their crucial role they must be tested thoroughly. Automated compiler testing techniques (those based on randomly generated programs) therefore tend to run a large number of test programs (which are the test inputs of compilers), and the cost of compiling and executing these programs is significant. Such techniques can take a long time to detect a relatively small number of compiler bugs, which causes practical problems such as high time and financial costs and delays to the development/release cycle. Recently, some approaches have been proposed to accelerate compiler testing by executing the test programs that are more likely to trigger compiler bugs earlier, according to some criteria. However, these approaches ignore an important aspect of compiler testing: different test programs may have similar test capabilities (i.e., they test similar functionalities of a compiler, or even detect the same compiler bug), which can largely discount the acceleration if test programs with similar test capabilities are executed repeatedly. Test coverage is a suitable approximation of test capability and can help distinguish such programs, but collecting coverage dynamically is infeasible in compiler testing because most test programs are generated on the fly by automatic test-generation tools such as Csmith. In this paper, we propose the first method to predict test coverage statically for compilers, and then propose to prioritize test programs by clustering them according to the predicted coverage information. This novel approach to accelerating compiler testing through coverage prediction is called COP (short for COverage Prediction).
Our evaluation on GCC and LLVM demonstrates that COP significantly accelerates compiler testing, achieving an average speedup of 51.01 percent in test execution time on an existing dataset that includes three old release versions of the compilers, and an average speedup of 68.74 percent on a new dataset that includes the 12 latest release versions. Moreover, COP significantly outperforms the state-of-the-art acceleration approach, improving speedups by $17.16\%\sim 82.51\%$ on average across different settings.
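
As a rough illustration of the idea of prioritizing test programs by clustering them on predicted coverage, the sketch below groups programs whose predicted coverage sets are similar and then schedules one program from each group in turn, so that programs with similar test capabilities are spread out rather than run back to back. It is not COP's actual algorithm: the Jaccard distance, the greedy leader clustering, the `threshold` value, the program names, and the compiler file names are all assumptions made for this example, and the predicted coverage sets are taken as given rather than produced by a prediction model.

```python
from typing import Dict, FrozenSet, List


def jaccard_distance(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Distance between two predicted coverage sets (0.0 means identical)."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def cluster_by_coverage(
    predicted: Dict[str, FrozenSet[str]], threshold: float = 0.5
) -> List[List[str]]:
    """Greedy leader clustering (an assumption, not the paper's method).

    A program joins the first cluster whose leader (first member) is within
    `threshold`; otherwise it starts a new cluster.
    """
    clusters: List[List[str]] = []
    for name, coverage in predicted.items():
        for cluster in clusters:
            leader = cluster[0]
            if jaccard_distance(coverage, predicted[leader]) <= threshold:
                cluster.append(name)
                break
        else:
            clusters.append([name])
    return clusters


def prioritize(clusters: List[List[str]]) -> List[str]:
    """Round-robin over clusters: one program from each cluster per pass."""
    order: List[str] = []
    depth = 0
    while any(depth < len(cluster) for cluster in clusters):
        for cluster in clusters:
            if depth < len(cluster):
                order.append(cluster[depth])
        depth += 1
    return order


# Hypothetical predicted coverage: compiler source files each program is
# expected to exercise. Names are illustrative only.
predicted = {
    "p1": frozenset({"fold.c", "cse.c"}),
    "p2": frozenset({"fold.c", "cse.c", "dce.c"}),
    "p3": frozenset({"vect.c", "loop.c"}),
    "p4": frozenset({"vect.c", "loop.c"}),
    "p5": frozenset({"regalloc.c"}),
}

clusters = cluster_by_coverage(predicted)
order = prioritize(clusters)
print(order)  # ['p1', 'p3', 'p5', 'p2', 'p4']
```

In this toy input, `p1`/`p2` and `p3`/`p4` land in the same clusters, so the schedule runs one representative of each cluster (`p1`, `p3`, `p5`) before returning to the near-duplicates.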
