Referee: A Pattern-Guided Approach for Auto Design in Compiler-Based Analyzers
Fang Lv, Hao Li, Lei Wang, Y. Liu, Huimin Cui, Jingling Xue, Xiaobing Feng
DOI: https://doi.org/10.1109/SANER48275.2020.9054849
2020-02-01
Abstract: Coding rules are becoming increasingly critical for security-oriented software, whose analyzers favor compilers as their base platforms because they demand not only mature grammatical analysis but also compilation and optimization techniques. However, engineering such a compiler-based analyzer requires exploring a proper launch point for each of hundreds of rules before integrating them one by one into the compiler frontend, a completely manual decision-making process that involves heavy redundant effort. To improve this, we introduce a novel pattern-guided approach, named Referee. Referee improves on the manual approach by making three advances: (1) our pattern-guided approach significantly reduces the amount of redundant manual effort required, (2) a twin-graph aided broadcasting process enables rule patterns to be characterized with partially developed rules, and (3) a reliable recommendation mechanism pinpoints the launch point for a new rule based on the experience accumulated from handling earlier rules. We have implemented Referee in GCC 8.2 with 163 rules from the SPACE-C and MISRA-C standards. When trained with 70% of all the rules, Referee achieves an accuracy of 89.9% in automatically recommending launch points for new rules in our GCC-based analyzer. Decreasing the training data size to 60% and 50% still yields accuracies of 87.7% and 81.5%, respectively. Therefore, with a careful selection of seeding rule patterns, Referee can significantly reduce the manual effort that would otherwise be required, providing an interesting and fruitful avenue for further research.
Computer Science