Evaluating seed selection for fuzzing JavaScript engines

Ming Wen, Yongcong Wang, Yifan Xia, Hai Jin
DOI: https://doi.org/10.1007/s10664-023-10340-9
2023-01-01
Abstract: JavaScript (JS), a platform-independent programming language, has remained the most popular language for years. However, the popular JavaScript engines that web browsers widely use to interpret JS code have become the most common targets for attackers. Ensuring the security and reliability of JS engines is therefore important. Fuzzing is a simple yet effective method to unveil vulnerabilities. However, existing JS fuzzers focus on designing effective mutation mechanisms to generate diverse and valid seeds, and they often ignore the importance of the initial seed corpus selected to drive the fuzzing process. In this paper, we perform extensive experiments to systematically evaluate the impact of seed selection on fuzzing JavaScript engines. In particular, we investigate seed selection along three main dimensions: the sources from which seeds are collected (e.g., CVE PoCs and regression tests), their number and sizes, and a set of code properties of concern. Our major findings reveal that seeds collected from different sources can have a significant impact on fuzzing effectiveness (i.e., CVE PoCs are significantly better than the other types of seeds), and that seed files containing the code structures of concern lead existing fuzzers to achieve superior results in terms of both code coverage and unique crashes identified. Inspired by our observations, we devise a simple heuristic to prioritize JavaScript files when selecting a seed corpus. Our experiments show that, when driven by our selected seed corpus, an existing state-of-the-art fuzzer achieves significantly higher code coverage and identifies more crashes.