EMS: History-Driven Mutation for Coverage-based Fuzzing.

Chenyang Lyu, Shouling Ji, Xuhong Zhang, Hong Liang, Binbin Zhao, Kangjie Lu, Raheem Beyah
DOI: https://doi.org/10.14722/ndss.2022.23162
2022-01-01
Abstract: Mutation-based fuzzing is one of the most popular approaches to discovering vulnerabilities in a program. To alleviate the inefficiency that high randomness in the mutation process causes, multiple solutions have been developed in recent years, especially in coverage-based fuzzing. They mainly employ adaptive mutation strategies or integrate constraint-solving techniques to better explore the test cases that trigger unique paths and crashes. However, they lack fine-grained reuse of fuzzing history to construct these interesting test cases, i.e., they largely fail to properly utilize fuzzing history across different fuzzing trials. In fact, we discover that test cases in fuzzing history contain rich knowledge of the key mutation strategies that lead to the discovery of unique paths and crashes. Specifically, partial path constraint solutions implicitly carried in these mutation strategies can be reused to accelerate the discovery of new paths and crashes that share similar partial path constraints. Therefore, we first propose a lightweight and efficient Probabilistic Byte Orientation Model (PBOM) that properly captures the byte-level mutation strategies from intra- and inter-trial history and thus can effectively trigger unique paths and crashes. We then present a novel history-driven mutation framework named EMS that employs PBOM as one of the mutation operators to probabilistically provide desired mutation byte values according to the input ones. We evaluate EMS against state-of-the-art fuzzers including AFL, QSYM, MOPT, MOPT-dict, EcoFuzz, and AFL++ on 9 real-world programs. The results show that EMS discovers up to 4.91× more unique vulnerabilities than the baseline, and achieves higher line coverage than other fuzzers on most programs. We report all of the newly discovered vulnerabilities to vendors and will open source the prototype of EMS on GitHub.
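To make the idea of "probabilistically providing mutation byte values according to the input ones" concrete, the following is a minimal Python sketch of a history-driven byte substitution table. It is an illustration only, not the authors' PBOM: the class name `ByteOrientationModel`, its methods, and the choice to weight by simple success counts at single-byte granularity are all assumptions made for this example, since the abstract does not specify the model's internals.

```python
import random
from collections import Counter, defaultdict


class ByteOrientationModel:
    """Hypothetical history table: input byte -> counts of replacement bytes
    seen in mutations that triggered a new path or crash."""

    def __init__(self):
        self.table = defaultdict(Counter)

    def record(self, before: bytes, after: bytes) -> None:
        """Log every byte substitution from one successful mutation."""
        for old, new in zip(before, after):
            if old != new:
                self.table[old][new] += 1

    def suggest(self, old: int, rng: random.Random):
        """Sample a replacement for `old`, weighted by how often it paid off.
        Returns None when the history holds nothing for this input byte."""
        counts = self.table.get(old)
        if not counts:
            return None
        values = list(counts.keys())
        weights = list(counts.values())
        return rng.choices(values, weights=weights, k=1)[0]

    def mutate(self, data: bytes, pos: int, rng: random.Random) -> bytes:
        """Replace the byte at `pos` with a history-guided value, if any."""
        new = self.suggest(data[pos], rng)
        if new is None:
            return data  # a real fuzzer would fall back to other operators
        return data[:pos] + bytes([new]) + data[pos + 1:]
```

In this sketch, `record` would be called whenever a mutated test case reaches a new path or crash, and `mutate` would sit alongside a fuzzer's ordinary mutation operators. Persisting `table` between runs is one simple way to carry history across trials, which is the kind of inter-trial reuse the abstract describes.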