CodeV: Empowering LLMs for Verilog Generation through Multi-Level Summarization

Yang Zhao, Di Huang, Chongxiao Li, Pengwei Jin, Ziyuan Nan, Tianyun Ma, Lei Qi, Yansong Pan, Zhenxing Zhang, Rui Zhang, Xishan Zhang, Zidong Du, Qi Guo, Xing Hu, Yunji Chen
2024-07-21
Abstract: The increasing complexity and high costs associated with modern processor design have led to a surge in demand for processor design automation. Instruction-tuned large language models (LLMs) have demonstrated remarkable performance in automatically generating code for general-purpose programming languages like Python. However, these methods fail on hardware description languages (HDLs) like Verilog due to the scarcity of high-quality instruction-tuning data, and even advanced LLMs like GPT-3.5 exhibit limited performance on Verilog generation. Regarding this issue, we observe that (1) Verilog code collected from the real world is of higher quality than code generated by LLMs, and (2) LLMs like GPT-3.5 are better at summarizing Verilog code than at generating it. Based on these observations, this paper introduces CodeV, a series of open-source instruction-tuned Verilog generation LLMs. Instead of generating descriptions first and then obtaining the corresponding code from advanced LLMs, we prompt the LLM with Verilog code and let it generate the corresponding natural language description through multi-level summarization. Experimental results show that CodeV surpasses the previous open-source SOTA by a relative 14.4% on VerilogEval (over BetterV) and 11.3% on RTLLM (over RTLCoder), and also outperforms the previous commercial SOTA, GPT-4, by a relative 22.1% on VerilogEval.
Programming Languages, Artificial Intelligence
What problem does this paper attempt to address?
The paper is motivated by the growing complexity and cost of modern processor design, and focuses on one piece of the design-automation problem: using large language models (LLMs) to generate high-quality code in the Verilog hardware description language (HDL). Instruction-tuned LLMs perform well on code generation for general-purpose languages such as Python, but perform poorly on Verilog, primarily because high-quality instruction-tuning data is scarce. Even advanced LLMs such as GPT-3.5 show limited performance on Verilog generation, so they are a weak source of synthetic training code. The paper builds on two observations: Verilog code collected from the real world is of higher quality than LLM-generated code, and LLMs like GPT-3.5 are better at summarizing Verilog than at writing it. It therefore proposes CodeV, a series of open-source instruction-tuned Verilog generation models trained on data produced by multi-level summarization. Rather than first generating descriptions and then asking an advanced LLM for the corresponding code, CodeV starts from existing Verilog code and has the LLM generate the matching natural language description. In the reported experiments, CodeV surpasses the previous open-source state of the art by a relative 14.4% on VerilogEval (over BetterV) and 11.3% on RTLLM (over RTLCoder), and outperforms GPT-4, the previous commercial state of the art, by a relative 22.1% on VerilogEval.
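The reversed data flow described above (code in, description out) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the two-step prompting (a detailed description first, then a high-level summary), the prompt wording, the function name `build_instruction_pair`, and the stub `fake_llm` are all assumptions made for illustration. The summary here does not specify how many levels the summarization uses or what the actual prompts look like.

```python
from typing import Callable, Dict

# Hypothetical sketch: turn one real-world Verilog module into an
# (instruction, response) pair by asking an LLM to summarize it.
# `llm` is any callable mapping a prompt string to a completion string.
def build_instruction_pair(verilog_code: str, llm: Callable[[str], str]) -> Dict[str, str]:
    # Level 1 (assumed): ask for a detailed, low-level description of the module.
    detail_prompt = (
        "Describe in detail what the following Verilog module does, "
        "including its ports and behavior.\n\n" + verilog_code
    )
    detailed = llm(detail_prompt)

    # Level 2 (assumed): condense code plus details into a high-level request.
    summary_prompt = (
        "Write a short high-level summary of this Verilog module, phrased as "
        "a design request.\n\n"
        f"Code:\n{verilog_code}\n\nDetailed description:\n{detailed}"
    )
    summary = llm(summary_prompt)

    # The summary becomes the instruction; the original code is the response.
    return {"instruction": summary, "response": verilog_code}


# Stand-in for a real model so the sketch runs offline (assumption).
def fake_llm(prompt: str) -> str:
    if prompt.startswith("Describe in detail"):
        return "DETAILED: 2-input AND gate with inputs a, b and output y."
    return "SUMMARY: Implement a 2-input AND gate."


example_code = "module and2(input a, input b, output y);\n  assign y = a & b;\nendmodule"
pair = build_instruction_pair(example_code, fake_llm)
```

In this sketch the real-world Verilog is kept verbatim as the training target and only the natural language side is synthesized, which matches the paper's stated preference for summarizing over generating.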