History-driven Compiler Fuzzing Via Assembling and Scheduling Bug-triggering Code Segments

Zhenye Fan, Guixin Ye, Tianmin Hu, Zhanyong Tang
DOI: https://doi.org/10.1109/issre62328.2024.00040
2024-01-01
Abstract: History-driven testing techniques have proven effective for detecting compiler bugs. They employ fuzzing history (e.g., historical test cases or historical execution information) to guide the generation of valid test cases. However, prior methods either synthesize bug-triggering test cases inefficiently or suffer from a plateau in code coverage, resulting in a low bug-exposing ability. This paper presents ASMFUZZ, a history-driven compiler testing framework that applies a multi-metric hybrid scheduling strategy. Specifically, ASMFUZZ first extracts the bug-triggering code segments from historical test cases that triggered bugs. The extracted segments are then used to assemble new test cases. To ensure the correctness of the newly synthesized test cases, ASMFUZZ always selects segments with code context dependencies for assembly. During assembly, the ingredients to be assembled are determined based on multiple feedback metrics (e.g., anomalous behaviors and code coverage). To this end, we propose a multi-metric hybrid scheduling scheme that selects optimal code segments in each testing iteration. This helps to continuously cover deep code branches of the compiler during the whole testing process, avoiding getting stuck in a code coverage plateau. We evaluated ASMFUZZ on three mainstream JVMs (OpenJ9, HotSpot, and GraalVM) involving six JDK versions. Within a 72-hour concurrent test run, ASMFUZZ exposed 16 previously unknown unique bugs, of which 11 have been confirmed by the developers. We also compared ASMFUZZ to four prior state-of-the-art fuzzers. ASMFUZZ uncovers 1.6× to 2.2× more bugs than the comparative baselines.
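To make the idea of scheduling segments by multiple feedback metrics concrete, here is a minimal sketch. It is not ASMFUZZ's actual scheme: the abstract does not give the scoring formula, so the `Segment` fields, the equal weights, and the "try untested segments first" rule are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """A bug-triggering code segment plus its feedback so far (hypothetical fields)."""
    name: str
    runs: int = 0          # times the segment was assembled into a test case
    anomalies: int = 0     # runs that showed anomalous behavior (e.g., a crash)
    new_branches: int = 0  # compiler branches first covered by those runs


def score(seg, w_anomaly=0.5, w_coverage=0.5):
    """Blend two feedback metrics into one priority (weights are illustrative)."""
    if seg.runs == 0:
        return float("inf")  # assumption: every segment is tried at least once
    anomaly_rate = seg.anomalies / seg.runs
    coverage_rate = seg.new_branches / seg.runs
    return w_anomaly * anomaly_rate + w_coverage * coverage_rate


def pick(segments):
    """Select the segment with the highest blended score for the next iteration."""
    return max(segments, key=score)


def record(seg, anomalous, new_branches):
    """Update a segment's feedback after one test run."""
    seg.runs += 1
    seg.anomalies += int(anomalous)
    seg.new_branches += new_branches
```

In this sketch a segment that stops yielding new coverage or anomalies sees its score decay as `runs` grows, so the scheduler drifts toward segments that are still productive; that is one plausible way to avoid a coverage plateau, stated here as an assumption rather than as the paper's mechanism.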