Abstract: Security vulnerabilities play a vital role in network security systems. Fuzzing is widely used as a vulnerability discovery technique to reduce damage in advance. However, traditional fuzz testing faces many challenges, such as how to mutate input seed files, how to increase code coverage, and how to bypass format verification effectively. Machine learning techniques have therefore been introduced into fuzz testing to alleviate these challenges. This paper reviews recent research progress on using machine learning techniques for fuzz testing, analyzes how machine learning improves the fuzzing process and results, and sheds light on future work in fuzzing. First, this paper discusses why machine learning techniques can be applied to fuzzing scenarios and identifies five different stages in which machine learning has been used. It then systematically studies machine learning-based fuzzing models along five dimensions: the selection of machine learning algorithms, pre-processing methods, datasets, evaluation metrics, and hyperparameter settings. Next, this paper assesses the performance of the machine learning techniques used in existing fuzz testing research. The evaluation results show that machine learning techniques have an acceptable predictive capability for fuzzing. Finally, the vulnerability discovery capability of both traditional fuzzers and machine learning-based fuzzers is analyzed. The results indicate that introducing machine learning techniques can improve the performance of fuzzing. We hope to provide researchers with a systematic and more in-depth understanding of fuzzing based on machine learning techniques, and to provide some references for this field through analysis and summarization across multiple dimensions.

Reinforcement Learning-Based Fuzzing Technology

DRLFCfuzzer: fuzzing with Deep-Reinforcement-Learning under Format Constraints

High-performance Directional Fuzzing Scheme Based on Deep Reinforcement Learning

FA-Fuzz: A Novel Scheduling Scheme Using Firefly Algorithm for Mutation-Based Fuzzing

Coverage-guided fuzzing for deep reinforcement learning systems

Effectively Generating Vulnerable Transaction Sequences in Smart Contracts with Reinforcement Learning-guided Fuzzing

LAFuzz: Neural Network for Efficient Fuzzing

FuzzCoder: Byte-level Fuzzing Test via Large Language Model

FDFuzz: Applying Feature Detection to Fuzz Deep Learning Systems

CovRL: Fuzzing JavaScript Engines with Coverage-Guided Reinforcement Learning for LLM-based Mutation

A First Look at the Effect of Deep Learning in Coverage-guided Fuzzing

Better Pay Attention Whilst Fuzzing

DLFuzz: Differential Fuzzing Testing of Deep Learning Systems

Coverage-guided fuzz testing method based on reinforcement learning seed scheduling

DeFuzz: Deep Learning Guided Directed Fuzzing

Context-Aware Fuzzing for Robustness Enhancement of Deep Learning Models

CMFuzz: Context-Aware Adaptive Mutation for Fuzzers

V-Fuzz: Vulnerability-Oriented Evolutionary Fuzzing

Pafl: Adaptive Energy Allocation with Upper Confidence Bound

A systematic review of fuzzing based on machine learning techniques

Improving Grey-Box Fuzzing by Modeling Program Behavior