Abstract: Compilers are an important kind of software, and, as with other software, testing is one of the most widely used ways of assuring their quality. Compiler bugs tend to occur in compiler optimizations. Detecting an optimization bug depends on two main factors: 1) the optimization flags that control the accessibility of the buggy compiler code must be turned on; and 2) the test program must be able to trigger the buggy code. However, existing compiler testing approaches consider only the latter when generating effective test programs, and simply run them under several predefined optimization levels (e.g., -O0, -O1, -O2, -O3, -Os in GCC). To better understand the influence of compiler optimizations on compiler testing, we conduct the first empirical study and find that 1) all the bugs detected under the widely used optimization levels are also detected under the explored optimization settings (we call a combination of optimization flags turned on for compilation an optimization setting), while 83.54% of bugs are detected only under the latter; and 2) optimization flags exhibit both inhibition and promotion effects on one another in compiler testing, which indicates both the necessity and the challenges of considering compiler optimizations in compiler testing. We then propose the first approach, called COTest, that considers both factors to test compilers. Specifically, COTest first adopts machine learning (the XGBoost algorithm) to model the relationship between test programs and optimization settings, so as to predict the bug-triggering probability of a test program under an optimization setting. Then, it designs a diversity augmentation strategy to select a set of diverse candidate optimization settings for prediction for a test program. Finally, the Top-K optimization settings are selected for compiler testing according to the predicted bug-triggering probabilities. Experiments on GCC and LLVM demonstrate its effectiveness; in particular, COTest detects 17 previously unknown bugs, 11 of which have been fixed or confirmed by developers.
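The three steps described above (predict, diversify, pick the Top-K) can be illustrated with a small sketch. This is not the COTest implementation: the abstract does not specify how diversity is measured, so the code below assumes a greedy farthest-point selection over Hamming distance, and it takes the predictor as an arbitrary callable standing in for the trained XGBoost model. All function names (`hamming`, `diverse_candidates`, `top_k`, `select_settings`) are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

# An optimization setting is modeled as one bit per flag: 1 = on, 0 = off.
Setting = Tuple[int, ...]


def hamming(a: Setting, b: Setting) -> int:
    """Number of flags on which two settings differ."""
    return sum(x != y for x, y in zip(a, b))


def diverse_candidates(pool: Sequence[Setting], n: int) -> List[Setting]:
    """Assumed diversity strategy: greedy farthest-point selection.

    Starting from the first setting in the pool, repeatedly add the
    setting whose minimum Hamming distance to the chosen ones is largest.
    """
    unique = list(dict.fromkeys(pool))  # drop duplicates, keep order
    if not unique or n <= 0:
        return []
    chosen = [unique[0]]
    while len(chosen) < min(n, len(unique)):
        best = max(
            (s for s in unique if s not in chosen),
            key=lambda s: min(hamming(s, c) for c in chosen),
        )
        chosen.append(best)
    return chosen


def top_k(
    candidates: Sequence[Setting],
    predict: Callable[[Setting], float],
    k: int,
) -> List[Setting]:
    """Rank candidates by predicted bug-triggering probability, keep k."""
    return sorted(candidates, key=predict, reverse=True)[:k]


def select_settings(
    pool: Sequence[Setting],
    predict: Callable[[Setting], float],
    n_candidates: int,
    k: int,
) -> List[Setting]:
    """Diversify first, then keep the Top-K by predicted probability."""
    return top_k(diverse_candidates(pool, n_candidates), predict, k)
```

In this sketch `predict` scores a setting for one fixed test program; in a fuller version it would wrap a model trained on features of both the test program and the setting. The settings returned by `select_settings` are the ones under which that test program would then be compiled.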
