VERT: Verified Equivalent Rust Transpilation with Large Language Models as Few-Shot Learners

Aidan Z.H. Yang, Yoshiki Takashima, Brandon Paulsen, Josiah Dodds, Daniel Kroening
2024-05-25
Abstract: Rust is a programming language that combines memory safety and low-level control, providing C-like performance while guaranteeing the absence of undefined behaviors by default. Rust's growing popularity has prompted research on safe and correct transpiling of existing code-bases to Rust. Existing work falls into two categories: rule-based and large language model (LLM)-based. While rule-based approaches can theoretically produce correct transpilations that maintain input-output equivalence to the original, they often yield unreadable Rust code that uses unsafe subsets of the Rust language. On the other hand, while LLM-based approaches typically produce more readable, maintainable, and safe code, they do not provide any guarantees about correctness. In this work, we present VERT, a tool that can produce readable Rust transpilations with formal guarantees of correctness. VERT's only requirement is that there is a Web Assembly compiler for the source language, which is true for most major languages. VERT first uses the Web Assembly compiler to obtain an oracle Rust program. In parallel, VERT uses an LLM to generate a readable candidate Rust program. This candidate is verified against the oracle, and if verification fails, we regenerate a new candidate transpilation until verification succeeds. We evaluate VERT by transpiling a suite of 1,394 programs taken from competitive programming style benchmarks. Combining Anthropic's Claude-2 and VERT increases Rust transpilations passing property-based testing from 31% to 54% and bounded model-checking from 1% to 42% compared to using Claude alone. In addition, we evaluate VERT's ability to generate non-trivial safe Rust on programs taken from real-world C projects that make significant use of pointers. Our results provide insights into the limitations of LLMs to write safe Rust.
Programming Languages, Software Engineering
### What problem does this paper attempt to solve?

This paper addresses the problem of safely and correctly converting existing codebases to Rust. It targets the limitations of the two existing families of approaches:

1. **Limitations of rule-based approaches**: Rule-based transpilers can in theory generate correct code that maintains input-output equivalence with the original, but the resulting Rust is often hard to read and maintain, and it may rely on the unsafe subset of the language.
2. **Limitations of large language model (LLM) approaches**: LLMs usually generate more readable, maintainable, and safer code, but they provide no formal guarantee of correctness. An LLM may produce code that looks reasonable yet is subtly wrong, and such errors can be hard to find and debug.

To address both, the authors propose a tool named VERT. VERT combines the strengths of the rule-based approach and the LLM, and adds formal verification so that the generated Rust code is both readable and functionally equivalent to the original code.

### How VERT works

The main workflow of VERT is as follows:

1. **Generate the oracle Rust program**: Compile the source program to Web Assembly with a Web Assembly compiler for the source language, then convert the Web Assembly to Rust with the rWasm tool. This ensures that the generated Rust code is functionally equivalent to the original program, although it may be hard to read.
2. **Generate a candidate Rust program**: Use an LLM to generate a more readable candidate version of the Rust code.
3. **Verify the candidate program**: Check the LLM-generated candidate against the oracle Rust program for functional equivalence. If verification fails, generate a new candidate and repeat until verification succeeds.

In this way, VERT produces readable Rust code while checking its correctness against the oracle.

### Experimental results

The authors evaluated VERT experimentally, and the results show that:

- Combining Anthropic's Claude-2 with VERT increased the proportion of Rust transpilations passing property-based testing from 31% to 54%, and the proportion passing bounded model checking from 1% to 42%, compared with using Claude alone.
- For C code taken from real-world projects, VERT can generate Rust with a relatively simple ownership model, which is suitable for small programs (fewer than 36 lines of code).

In conclusion, VERT improves the success rate and quality of transpiling code from other languages to Rust by combining a rule-based approach, an LLM, and formal verification techniques.
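The generate-and-verify loop described under "How VERT works" can be illustrated with a minimal Rust sketch. This is not VERT's implementation: every name here (`oracle`, `Candidate`, `check_equivalent`, `transpile`) is hypothetical, the LLM is replaced by a fixed list of hand-written candidates, and exhaustive comparison over a small input range stands in for the property-based testing and bounded model checking mentioned above.

```rust
// Hypothetical sketch of a generate-and-verify loop. None of these names
// come from VERT; they exist only to illustrate the control flow.

/// Stand-in for the oracle program: assumed correct, but in practice unreadable.
fn oracle(x: i32) -> i32 {
    x.wrapping_mul(2).wrapping_add(1)
}

/// A candidate is any function with the same signature as the oracle.
type Candidate = fn(i32) -> i32;

/// Stand-in for a first LLM attempt that is subtly wrong (drops the `+ 1`).
fn candidate_wrong(x: i32) -> i32 {
    x.wrapping_mul(2)
}

/// Stand-in for a later LLM attempt that matches the oracle.
fn candidate_right(x: i32) -> i32 {
    x.wrapping_add(x).wrapping_add(1)
}

/// Compare candidate and oracle on every input in [-bound, bound].
/// Returns the first input on which they disagree, if any.
fn check_equivalent(candidate: Candidate, bound: i32) -> Option<i32> {
    (-bound..=bound).find(|&x| candidate(x) != oracle(x))
}

/// Try candidates in order until one passes the check or the attempt
/// budget runs out. Returns the index of the accepted candidate and the
/// candidate itself.
fn transpile(
    candidates: &[Candidate],
    max_attempts: usize,
    bound: i32,
) -> Option<(usize, Candidate)> {
    for (attempt, &cand) in candidates.iter().take(max_attempts).enumerate() {
        match check_equivalent(cand, bound) {
            None => return Some((attempt, cand)),
            Some(cex) => {
                eprintln!("attempt {}: differs from oracle at input {}", attempt, cex)
            }
        }
    }
    None
}

fn main() {
    let candidates: [Candidate; 2] = [candidate_wrong, candidate_right];
    match transpile(&candidates, 5, 1000) {
        Some((attempt, _)) => println!("accepted candidate from attempt {}", attempt),
        None => println!("no candidate verified within the attempt budget"),
    }
}
```

The sketch adds an attempt budget (`max_attempts`) as an assumption so that the loop always terminates. Note also that agreement on a bounded input range is weaker than a proof of equivalence, which is why the summary reports property-based testing and bounded model checking as separate pass rates.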