Detecting Compiler Bugs Via a Deep Learning-Based Framework
Yixuan Tang, Zhilei Ren, He Jiang, Lei Qiao, Dong Liu, Zhide Zhou, Weiqiang Kong
DOI: https://doi.org/10.1142/s0218194022500206
IF: 1.007
2022-01-01
International Journal of Software Engineering and Knowledge Engineering
Abstract: Compiler testing is the most widely used way to assure compiler quality. However, because compilers require a large number of sophisticated test programs as inputs, existing compiler testing approaches remain limited in their ability to generate test programs that are both syntactically valid and diverse. In this paper, we propose DeepGen, a deep learning-based approach that supports compiler testing by inferring a generative model for compiler inputs. First, DeepGen trains a Transformer-XL model on a large corpus of seed programs and uses the trained model to generate syntactically valid programs. Then, DeepGen adopts a sampling strategy in the inference phase to generate diverse test programs. Finally, DeepGen applies differential testing to the generated programs to discover compiler bugs. We have evaluated DeepGen on two popular C++ compilers, GCC and LLVM, and the results confirm the effectiveness of our approach. DeepGen detects 35.29%, 53.33%, and 187.50% more bugs than three existing approaches, i.e., DeepSmith, DeepFuzz, and Csmith, respectively. In addition, 30.43% of the bugs detected by DeepGen are not detected by the other approaches. Furthermore, DeepGen has successfully detected 38 bugs in the latest development versions of GCC and LLVM; 21 of them have been confirmed or fixed by the developers.
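The differential testing step mentioned in the abstract can be illustrated with a minimal Python sketch. It is not DeepGen's implementation: the abstract does not say how outcomes are compared, so the majority-vote rule here is an assumption, and `CompilerRunner` and `differential_test` are hypothetical names. Each runner stands in for "compile and execute the program with one compiler or configuration, and return the observable outcome".

```python
from collections import Counter
from typing import Callable, Dict, List

# Hypothetical stand-in for "compile and run with one compiler/configuration":
# maps program source text to an observable outcome (e.g. exit code + stdout).
CompilerRunner = Callable[[str], str]


def differential_test(program: str, runners: Dict[str, CompilerRunner]) -> List[str]:
    """Return the sorted names of runners whose outcome deviates from the majority.

    Assumption (not from the paper): with no strict majority, every runner is
    reported, because the disagreement alone is worth inspecting.
    """
    if not runners:
        return []
    outcomes = {name: run(program) for name, run in runners.items()}
    majority, votes = Counter(outcomes.values()).most_common(1)[0]
    if votes == len(outcomes):
        return []  # all runners agree: nothing suspicious
    if votes * 2 <= len(outcomes):
        return sorted(outcomes)  # no strict majority: flag everyone
    return sorted(name for name, out in outcomes.items() if out != majority)


if __name__ == "__main__":
    # Stub runners for illustration only; real ones would invoke the compilers.
    stubs = {
        "compiler_a": lambda src: "exit=0",
        "compiler_b": lambda src: "exit=0",
        "compiler_c": lambda src: "exit=1",
    }
    print(differential_test("int main() { return 0; }", stubs))
```

With only two runners, for instance one per compiler, any disagreement has no strict majority, so both are reported and the program has to be inspected by hand to decide which compiler is at fault.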