A Decision Procedure for String to Code Point Conversion

Andrew Reynolds, Andres Nötzli, Clark Barrett, Cesare Tinelli
DOI: https://doi.org/10.1007/978-3-030-51074-9_13
2020-01-01
Abstract: In text encoding standards such as Unicode, text strings are sequences of code points, each of which can be represented as a natural number. We present a decision procedure for a concatenation-free theory of strings that includes length and a conversion function from strings to integer code points. Furthermore, we show how many common string operations, such as conversions between lowercase and uppercase, can be naturally encoded using this conversion function. We describe our implementation of this approach in the SMT solver CVC4, which contains a high-performance string subsolver, and show that the use of a native procedure for code points significantly improves its performance with respect to other state-of-the-art string solvers.
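To make the idea of a string to code point conversion concrete, here is a minimal Python sketch of how such a function behaves and how an uppercase conversion can be phrased in terms of it. This is an illustration only, not the decision procedure or the CVC4 implementation. The names `to_code` and `to_upper_ascii` are hypothetical. The convention of returning -1 for strings whose length is not 1 follows the SMT-LIB `str.to_code` operator, and restricting case mapping to the ASCII range is a simplifying assumption made here.

```python
def to_code(s: str) -> int:
    """Return the code point of a length-1 string, or -1 otherwise.

    Assumption: the -1 convention mirrors SMT-LIB's str.to_code.
    """
    return ord(s) if len(s) == 1 else -1


def to_upper_ascii(s: str) -> str:
    """Uppercase a string character by character using only code points.

    Assumption: only ASCII letters 'a'..'z' (code points 97..122) are
    mapped; every other character is left unchanged.
    """
    out = []
    for ch in s:
        c = to_code(ch)
        if 97 <= c <= 122:
            c -= 32  # 'a' (97) maps to 'A' (65), and so on
        out.append(chr(c))
    return "".join(out)
```

The point of the sketch is that the case conversion reduces to integer comparisons and arithmetic on per-character code points, which is the kind of constraint an arithmetic-aware solver can reason about directly.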