Seq2Seq-AFL: Fuzzing Via Sequence-to-sequence Model

Liqun Yang, Chaoren Wei, Jian Yang, Jinxin Ma, Hongcheng Guo, Long Cheng, Zhoujun Li
DOI: https://doi.org/10.1007/s13042-024-02153-z
2024-01-01
Abstract: Fuzzing is a technique in which anomalous data is fed into software to find potential bugs. It is mainly used to discover vulnerabilities including, but not limited to, buffer overflows, memory leaks, and crashes when handling abnormal inputs. However, ensuring that all inputs are valid during fuzzing is infeasible in practice due to the high instrumentation overhead. Popular fuzzers (e.g., AFL) often generate a large number of invalid mutations, which prevents them from discovering potential paths that lead to new crashes. More importantly, it prevents fuzzers from making wise decisions on fuzzing operators. In this article, we propose a mutation-sensitive fuzzing solution, Seq2Seq-AFL, in which the mutation operator and mutation position are taken into account simultaneously, and different Seq2Seq models are designed to carry out the optimization scheme. The optimization scheme efficiently trains a function that produces mutation operator and mutation position pairs, and uses that function to conduct fuzzing. To verify the effectiveness of our scheme, we construct a dataset of two-dimensional vector data corresponding to the objdump, readelf, and nm programs. The experimental results demonstrate that our proposed scheme significantly improves the performance of the state-of-the-art AFL fuzzing tool, with coverage improvements of 13.7