Evolutionary Mutation-based Fuzzing as Monte Carlo Tree Search

Yiru Zhao,Xiaoke Wang,Lei Zhao,Yueqiang Cheng,Heng Yin
DOI: https://doi.org/10.48550/arXiv.2101.00612
2021-01-03
Software Engineering
Abstract:Coverage-based greybox fuzzing (CGF) has been approved to be effective in finding security vulnerabilities. Seed scheduling, the process of selecting an input as the seed from the seed pool for the next fuzzing iteration, plays a central role in CGF. Although numerous seed scheduling strategies have been proposed, most of them treat these seeds independently and do not explicitly consider the relationships among the seeds. In this study, we make a key observation that the relationships among seeds are valuable for seed scheduling. We design and propose a "seed mutation tree" by investigating and leveraging the mutation relationships among seeds. With the "seed mutation tree", we further model the seed scheduling problem as a Monte-Carlo Tree Search (MCTS) problem. That is, we select the next seed for fuzzing by walking this "seed mutation tree" through an optimal path, based on the estimation of MCTS. We implement two prototypes, AlphaFuzz on top of AFL and AlphaFuzz++ on top of AFL++. The evaluation results on three datasets (the UniFuzz dataset, the CGC binaries, and 12 real-world binaries) show that AlphaFuzz and AlphaFuzz++ outperform state-of-the-art fuzzers with higher code coverage and more discovered vulnerabilities. In particular, AlphaFuzz discovers 3 new vulnerabilities with CVEs.
What problem does this paper attempt to address?