Abstract: Greybox fuzzing is a powerful testing technique. Given a set of initial seeds, a greybox fuzzer continuously generates new test inputs, executes the program under test on them, and uses code coverage as feedback to steer subsequent executions. Seed prioritization is an important step of greybox fuzzing: it decides which promising seeds are chosen first for input generation. However, mainstream greybox fuzzers such as AFL++ and Zest tend to neglect seed prioritization. They may pick seeds simply in the order in which the seeds were queued, or in a randomly produced order, which can degrade their performance in exploring code and exposing bugs. Meanwhile, existing state-of-the-art techniques such as Alphuzz and K-Scheduler adopt complex strategies to schedule seeds. Although powerful, such strategies inevitably incur considerable overhead and reduce the scalability of these techniques. In this paper, we propose DiPri, a novel distance-based seed prioritization approach for greybox fuzzing. Specifically, DiPri evaluates the queued seeds according to seed distances and prioritizes the outliers, i.e., the seeds farthest from the others, to improve the probability of discovering previously unexplored code regions. To evaluate DiPri thoroughly, we prototype it on AFL++ and conduct large-scale experiments with four baselines and 24 C/C++ fuzz targets: eight from widely adopted real-world projects, eight from the coverage-based benchmark FuzzBench, and eight from the bug-based benchmark Magma.
The results, obtained from more than 50,000 CPU hours of fuzzing, suggest that DiPri (1) has no significant influence on the host fuzzer’s code coverage capability, slightly improving branch coverage on the eight real-world targets and slightly reducing it on the eight FuzzBench targets, and (2) improves the host fuzzer’s bug-finding capability by triggering five more Magma bugs. Beyond the evaluation on the three C/C++ benchmarks, we integrate DiPri into the Java fuzzer Zest and run experiments on a Java benchmark composed of five real-world programs for more than 8,000 CPU hours to empirically study the scalability of DiPri. The results on the Java benchmark demonstrate that DiPri scales well and can help the host fuzzer find bugs more consistently.
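
The idea of prioritizing outlier seeds can be illustrated with a minimal sketch. The abstract does not say how seeds are represented or which distance measure is used, so everything concrete here is an assumption made for illustration: each seed is a 0/1 coverage vector, the distance is the Hamming distance, and a seed's score is the sum of its distances to all other queued seeds. The names `hamming`, `prioritize`, and `queue` are hypothetical and are not taken from DiPri or AFL++.

```python
from typing import Dict, List


def hamming(a: List[int], b: List[int]) -> int:
    """Count the positions where two equal-length 0/1 vectors differ."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x != y for x, y in zip(a, b))


def prioritize(seeds: Dict[str, List[int]]) -> List[str]:
    """Return seed names ordered from farthest to closest to the others.

    Assumption: a seed's score is the sum of its Hamming distances to
    every other queued seed. Ties are broken by name for determinism.
    """
    scores = {
        name: sum(
            hamming(vec, other_vec)
            for other_name, other_vec in seeds.items()
            if other_name != name
        )
        for name, vec in seeds.items()
    }
    return sorted(scores, key=lambda name: (-scores[name], name))


# Hypothetical queue: three seeds cover similar branches, one is an outlier.
queue = {
    "seed_a": [1, 1, 0, 0, 0, 0],
    "seed_b": [1, 1, 1, 0, 0, 0],
    "seed_c": [1, 0, 1, 0, 0, 0],
    "seed_d": [0, 0, 0, 1, 1, 1],
}

order = prioritize(queue)
print(order[0])  # the outlier seed is scheduled first
```

In this toy queue, `seed_d` shares no covered branch with the other three seeds, so it receives the largest total distance and is picked first. The all-pairs scoring shown here costs quadratic time in the queue size; it is only meant to make the selection criterion concrete, not to reflect how a real fuzzer would keep the overhead low.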

DiPri: Distance-based Seed Prioritization for Greybox Fuzzing
