ProphetFuzz: Fully Automated Prediction and Fuzzing of High-Risk Option Combinations with Only Documentation via Large Language Model

Dawei Wang, Geng Zhou, Li Chen, Dan Li, Yukai Miao

DOI: https://doi.org/10.1145/3658644.3690231

2024-09-02

Abstract: Vulnerabilities related to option combinations pose a significant challenge in software security testing due to their vast search space. Previous research primarily addressed this challenge through mutation or filtering techniques, which inefficiently treated all option combinations as having equal potential for vulnerabilities, thus wasting considerable time on non-vulnerable targets and resulting in low testing efficiency. In this paper, we utilize carefully designed prompt engineering to drive the large language model (LLM) to predict high-risk option combinations (i.e., more likely to contain vulnerabilities) and perform fuzz testing automatically without human intervention. We developed a tool called ProphetFuzz and evaluated it on a dataset comprising 52 programs collected from three related studies. The entire experiment consumed 10.44 CPU years. ProphetFuzz successfully predicted 1748 high-risk option combinations at an average cost of only $8.69 per program. Results show that after 72 hours of fuzzing, ProphetFuzz discovered 364 unique vulnerabilities associated with 12.30% of the predicted high-risk option combinations, which was 32.85% higher than that found by the state of the art in the same timeframe. Additionally, using ProphetFuzz, we conducted persistent fuzzing on the latest versions of these programs, uncovering 140 vulnerabilities, with 93 confirmed by developers and 21 awarded CVE numbers.
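The abstract's figures can be cross-checked with a little arithmetic. The following is a back-of-envelope sketch and not part of the paper: the helper `count_combinations` and the choice of 30 options are illustrative assumptions, and the derived values assume that "32.85% higher" means a relative increase over the baseline's vulnerability count.

```python
from math import comb


def count_combinations(num_options: int, max_size: int) -> int:
    """Count non-empty option subsets containing at most max_size options."""
    return sum(comb(num_options, k) for k in range(1, max_size + 1))


# Illustrative only: a hypothetical program with 30 options.
small_space = count_combinations(30, 3)    # 4525 combinations of up to 3 options
full_space = count_combinations(30, 30)    # 2**30 - 1 non-empty subsets

# Figures quoted in the abstract.
programs = 52
cost_per_program_usd = 8.69
predicted_combinations = 1748
vulnerable_fraction = 0.1230
unique_vulnerabilities = 364
relative_improvement = 0.3285

# Derived values (approximations, not reported in the abstract).
total_cost_usd = round(programs * cost_per_program_usd, 2)            # about 451.88
combos_with_vulns = round(predicted_combinations * vulnerable_fraction)  # about 215
baseline_vulns = round(unique_vulnerabilities / (1 + relative_improvement))  # about 274

print(small_space, full_space)
print(total_cost_usd, combos_with_vulns, baseline_vulns)
```

Even a cap of three options per combination yields thousands of candidates for a single program, which is why the paper narrows the search to 1748 predicted combinations across all 52 programs rather than testing combinations uniformly.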

Cryptography and Security

What problem does this paper attempt to address?

The paper addresses the detection of vulnerabilities that are triggered by combinations of program options. Because the number of possible option combinations is vast, traditional approaches based on mutation or filtering are inefficient: they do not distinguish which combinations are more likely to contain vulnerabilities. The paper proposes ProphetFuzz, a method based on a large language model (LLM) that automatically predicts high-risk option combinations and fuzzes them. The main objectives of ProphetFuzz are:

1. **Improve testing efficiency**: By predicting the option combinations that are more likely to contain vulnerabilities, it reduces the time spent testing non-vulnerable targets.
2. **Automate the testing process**: The entire process requires no manual intervention, from parsing the documentation, to generating test commands and files, to running the fuzzer.
3. **Address issues with existing methods**: It overcomes problems in traditional methods such as lack of prioritization, semantic mismatches, and heavy dependence on experts.

With this approach, the researchers aim to discover more vulnerabilities and improve overall testing efficiency. Experimental results show that, after 72 hours of fuzzing, ProphetFuzz found 32.85% more unique vulnerabilities than the state of the art, at an average prediction cost of $8.69 per program.
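The flow described above (parse documentation, rank option combinations by risk, emit fuzzing commands) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the tool `./imgtool`, its documentation text, and all function names are hypothetical, and the keyword-counting `risk_score` is a crude stand-in for the LLM prediction step. The `@@` input placeholder follows the AFL convention, which is also an assumption about the fuzzer in use.

```python
import re
from itertools import combinations

# Hypothetical documentation excerpt for a made-up tool.
DOC = """\
-c, --compress   compress the output buffer
-r, --resize     resize the image before writing
-v, --verbose    print progress messages
-s, --strip      strip metadata and free unused memory
"""

# Hypothetical stand-in for the LLM's judgement: words hinting at
# memory-sensitive behavior. The real system prompts an LLM instead.
RISKY_WORDS = {"buffer", "resize", "memory", "free"}


def parse_options(doc):
    """Map each long option name to its one-line description."""
    options = {}
    for line in doc.splitlines():
        match = re.match(r"\s*-\w,\s+(--[\w-]+)\s+(.*)", line)
        if match:
            options[match.group(1)] = match.group(2)
    return options


def risk_score(descriptions):
    """Count risky words across the given descriptions (LLM placeholder)."""
    words = " ".join(descriptions).lower().split()
    return sum(word in RISKY_WORDS for word in words)


def rank_combinations(options, size=2, top_k=3):
    """Return the top_k option combinations, highest risk score first."""
    scored = []
    for combo in combinations(sorted(options), size):
        score = risk_score(options[name] for name in combo)
        scored.append((score, combo))
    # Sort by descending score, then alphabetically for a stable order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [combo for _, combo in scored[:top_k]]


def build_commands(program, ranked, input_placeholder="@@"):
    """Turn each ranked combination into a fuzzer target command line."""
    return [" ".join([program, *combo, input_placeholder]) for combo in ranked]


options = parse_options(DOC)
ranked = rank_combinations(options)
commands = build_commands("./imgtool", ranked)
for command in commands:
    print(command)
```

The point of the sketch is the ordering step: instead of fuzzing all six pairs with equal effort, only the highest-ranked combinations are handed to the fuzzer, which is the prioritization the summary credits for the efficiency gain.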

FA-Fuzz: A Novel Scheduling Scheme Using Firefly Algorithm for Mutation-Based Fuzzing

MorFuzz: Fuzzing Processor Via Runtime Instruction Morphing Enhanced Synchronizable Co-simulation

FuzzCoder: Byte-level Fuzzing Test via Large Language Model

V-Fuzz: Vulnerability-Oriented Evolutionary Fuzzing

V-Fuzz: Vulnerability Prediction-Assisted Evolutionary Fuzzing for Binary Programs

Fixing Security Vulnerabilities with AI in OSS-Fuzz

LLM4Fuzz: Guided Fuzzing of Smart Contracts with Large Language Models

Mopt: Optimized Mutation Scheduling For Fuzzers

Large Language Models Based Fuzzing Techniques: A Survey

Fuzzing with Quantitative and Adaptive Hot-Bytes Identification

Combining Software Metrics and Text Features for Vulnerable File Prediction

HyPFuzz: Formal-Assisted Processor Fuzzing

Vulnerability Detection Through an Adversarial Fuzzing Algorithm

PSOFuzz: Fuzzing Processors with Particle Swarm Optimization

Enhancing Black-box Compiler Option Fuzzing with LLM Through Command Feedback

Improving Grey-Box Fuzzing by Modeling Program Behavior

Homo in Machina: Improving Fuzz Testing Coverage via Compartment Analysis

EMS: History-Driven Mutation for Coverage-based Fuzzing

Beyond Random Inputs: A Novel ML-Based Hardware Fuzzing

Industrial Oriented Evaluation of Fuzzing Techniques