Fuzzing with Quantitative and Adaptive Hot-Bytes Identification

Tai D. Nguyen, Long H. Pham, Jun Sun

2023-07-05

Abstract: Fuzzing has emerged as a powerful technique for finding security bugs in complicated real-world applications. American fuzzy lop (AFL), a leading fuzzing tool, has demonstrated its bug-finding ability through a vast number of reported CVEs. However, its random mutation strategy is unable to generate test inputs that satisfy complicated branching conditions (e.g., magic-byte comparisons, checksum tests, and nested if-statements), which are commonly used in image decoders/encoders, XML parsers, and checksum tools. Existing approaches to this problem (such as Steelix and Neuzz) rest on unrealistic assumptions, for example that the branch condition can be satisfied byte by byte, or that the important bytes in the input (called hot-bytes) can be identified once and for all. In this work, we propose an approach called Finch, which is designed based on the following principles. First, the relation between inputs and branching conditions is complicated, so we need not only an expressive model to capture this relation but also an informative measure so that we can learn it effectively. Second, different branching conditions demand different hot-bytes, so we must adjust our fuzzing strategy adaptively depending on which branches are the current bottleneck. We implement our approach as an open-source project and compare its efficiency with other state-of-the-art fuzzers. Our evaluation results on 10 real-world programs and the LAVA-M dataset show that Finch achieves sustained increases in branch coverage and discovers more bugs than other fuzzers.

Cryptography and Security, Software Engineering

What problem does this paper attempt to address?

### The Problem the Paper Attempts to Solve

The paper addresses a critical issue in fuzzing: how to generate test inputs that cover complex branch conditions, and thereby uncover more security vulnerabilities. Specifically:

1. **Complex Branch Conditions**: Existing fuzzing tools (such as AFL) rely on random mutation, which struggles to generate test inputs that satisfy complex branch conditions (such as magic number comparisons, checksum tests, and nested if statements). As a result, code guarded by these conditions is rarely reached, and many potential security vulnerabilities remain undiscovered.
2. **Improvement Strategy**: The paper proposes a new method called Finch, which improves fuzzing in two main ways:
   - Using expressive neural network models to identify "hot-bytes," the input bytes that are crucial for triggering specific branch conditions.
   - Dynamically adjusting the fuzzing strategy by selecting different hot-bytes for mutation based on the current bottleneck branches, thereby gradually narrowing the gap between the test inputs and the target branches.
3. **Experimental Results**: In evaluations on 10 real-world programs and the LAVA-M dataset, Finch achieves higher branch coverage and discovers more unique crashes than other state-of-the-art fuzzing tools.

In summary, the main objective of the paper is to make fuzzing tools more effective and efficient at handling complex branch conditions by introducing a quantitative and adaptive hot-byte identification mechanism.
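The two ideas summarized above (a quantitative measure of how far an input is from a branch, and hot-bytes that are re-identified whenever the bottleneck branch changes) can be illustrated with a toy sketch. This is not Finch's implementation: the paper uses a neural network model, whereas the sketch below substitutes a plain single-byte perturbation probe to find hot-bytes. The toy target, the `MAGIC` value, and all function names are assumptions made up for illustration.

```python
import random

MAGIC = b"FUZZ"  # hypothetical magic value guarding the first branch


def branch_distances(data: bytes) -> list[int]:
    """Distance to each nested branch of a toy target (0 means satisfied).

    Branch 0: data[0:4] == MAGIC (magic-byte comparison).
    Branch 1: data[4] == sum(data[0:4]) % 256 (checksum test).
    Assumes len(data) >= 5.
    """
    d_magic = sum(abs(a - b) for a, b in zip(data[:4], MAGIC))
    d_sum = abs(data[4] - sum(data[:4]) % 256)
    return [d_magic, d_sum]


def bottleneck(data: bytes):
    """Index of the first unsatisfied branch, or None if all are satisfied."""
    for i, d in enumerate(branch_distances(data)):
        if d != 0:
            return i
    return None


def hot_bytes(data: bytes, branch: int) -> list[int]:
    """Offsets whose perturbation changes the distance to `branch`.

    A perturbation probe stands in for the paper's learned model (assumption).
    """
    base = branch_distances(data)[branch]
    hot = []
    for off in range(len(data)):
        trial = bytearray(data)
        trial[off] = (trial[off] + 1) % 256
        if branch_distances(bytes(trial))[branch] != base:
            hot.append(off)
    return hot


def fuzz(seed: bytes, rng: random.Random, max_iters: int = 200_000):
    """Mutate only the hot-bytes of the current bottleneck branch."""
    data, current, hot = seed, None, []
    for _ in range(max_iters):
        b = bottleneck(data)
        if b is None:
            return data
        if b != current:  # bottleneck moved: re-identify hot-bytes
            current, hot = b, hot_bytes(data, b)
        trial = bytearray(data)
        trial[rng.choice(hot)] = rng.randrange(256)
        # Keep the mutant only if it is closer to the bottleneck branch
        # without breaking earlier branches (lexicographic comparison).
        if branch_distances(bytes(trial)) < branch_distances(data):
            data = bytes(trial)
    return None


result = fuzz(bytes(8), random.Random(0))
```

In this toy setup the hot-bytes for the magic-byte branch are offsets 0 to 3, while the checksum branch additionally depends on offset 4, so the set of bytes worth mutating changes once the first branch is satisfied. That shift is the reason a one-time hot-byte identification would be insufficient here.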

FuzzCoder: Byte-level Fuzzing Test via Large Language Model

Better Pay Attention Whilst Fuzzing

FairFuzz: a targeted mutation strategy for increasing greybox fuzz testing coverage

LAFuzz: Neural Network for Efficient Fuzzing

NESTFUZZ: Enhancing Fuzzing with Comprehensive Understanding of Input Processing Logic

V-Fuzz: Vulnerability-Oriented Evolutionary Fuzzing

SLF: fuzzing without valid seed inputs

Not all bytes are equal: Neural byte sieve for fuzzing

Fuzzing Based on Function Importance by Attributed Call Graph

CMFuzz: Context-Aware Adaptive Mutation for Fuzzers

AMSFuzz: an Adaptive Mutation Schedule for Fuzzing

Fuzzing with Optimized Grammar-Aware Mutation Strategies

VisFuzz: understanding and intervening fuzzing with interactive visualization

Effective Fuzzing Based on Dynamic Taint Analysis

Fuzzing Based on Function Importance by Interprocedural Control Flow Graph

ShapFuzz: Efficient Fuzzing Via Shapley-Guided Byte Selection

SHFuzz: A Hybrid Fuzzing Method Assisted by Static Analysis for Binary Programs

Fuzzing: State of the Art

CAMFuzz: Explainable Fuzzing with Local Interpretation

Checksum-Aware Fuzzing Combined with Dynamic Taint Analysis and Symbolic Execution