Optimizing Regular Expressions via Rewrite-Guided Synthesis
Jedidiah McClurg,Miles Claver,Jackson Garner,Jake Vossen,Jordan Schmerge,Mehmet E. Belviranli
DOI: https://doi.org/10.48550/arXiv.2104.12039
2021-04-24
Programming Languages
Abstract:Regular expressions are pervasive in modern systems. Many real-world regular expressions are inefficient, sometimes to the extent that they are vulnerable to complexity-based attacks, and while much research has focused on detecting inefficient regular expressions or accelerating regular expression matching at the hardware level, we investigate automatically transforming regular expressions to remove inefficiencies. We reduce this problem to general expression optimization, an important task necessary in a variety of domains even beyond compilers, e.g., digital logic design, etc. Syntax-guided synthesis (SyGuS) with a cost function can be used for this purpose, but ordered enumeration through a large space of candidate expressions can be prohibitively expensive. Equality saturation is an alternative approach which allows efficient construction and maintenance of expression equivalence classes generated by rewrite rules, but the procedure may not reach saturation, meaning global minimality cannot be confirmed. We present a new approach called rewrite-guided synthesis (ReGiS), in which a unique interplay between SyGuS and equality saturation-based rewriting helps to overcome these problems, resulting in an efficient, scalable framework for expression optimization.