History-driven Test Program Synthesis for JVM Testing

Yingquan Zhao,Zan Wang,Junjie Chen,Mengdi Liu,Mingyuan Wu,Yuqun Zhang,Lingming Zhang
DOI: https://doi.org/10.1145/3510003.3510059
2022-01-01
Abstract:Java Virtual Machine (JVM) provides the runtime environment for Java programs, which allows Java to be “write once, run anywhere”. JVM plays a decisive role in the correctness of all Java programs running on it. Therefore, ensuring the correctness and robustness of JVM implementations is essential for Java programs. To date, various techniques have been proposed to expose JVM bugs via generating potential bug-revealing test programs. However, the diversity and effectiveness of test programs generated by existing research are far from enough since they mainly focus on minor syntactic/semantic mutations. In this paper, we propose JavaTailor, the first history-driven test program synthesis technique, which synthesizes diverse test programs by weaving the ingredients extracted from JVM historical bug-revealing test programs into seed programs for covering more JVM behaviors/paths. More specifically, JavaTailor first extracts five types of code ingredients from the historical bug-revealing test programs. Then, to synthesize diverse test programs, it iteratively inserts the extracted ingredients into the seed programs and strengthens their interactions via introducing extra data dependencies between them. Finally, JavaTailor employs these synthesized test programs to differentially test JVMs. Our experimental results on popular JVM implementations (i.e., HotSpot and OpenJ9) show that JavaTailor outperforms the state-of-the-art technique in generating more diverse and effective test programs, e.g., test programs generated by JavaTailor can achieve higher JVM code coverage and detect many more unique inconsistencies than the state-of-the-art technique. Furthermore, JavaTailor has detected 10 previously unknown bugs, 6 of which have been confirmed/fixed by developers.
What problem does this paper attempt to address?