CodeV: Empowering LLMs for Verilog Generation through Multi-Level Summarization

Yang Zhao, Di Huang, Chongxiao Li, Pengwei Jin, Ziyuan Nan, Tianyun Ma, Lei Qi, Yansong Pan, Zhenxing Zhang, Rui Zhang, Xishan Zhang, Zidong Du, Qi Guo, Xing Hu, Yunji Chen

2024-07-21

Abstract: The increasing complexity and high cost of modern processor design have led to a surge in demand for processor design automation. Instruction-tuned large language models (LLMs) have demonstrated remarkable performance in automatically generating code for general-purpose programming languages like Python. However, these methods fail on hardware description languages (HDLs) like Verilog because high-quality instruction-tuning data is scarce; even advanced LLMs like GPT-3.5 exhibit limited performance on Verilog generation. Regarding this issue, we observe that (1) Verilog code collected from the real world is of higher quality than code generated by LLMs, and (2) LLMs like GPT-3.5 are better at summarizing Verilog code than at generating it. Based on these observations, this paper introduces CodeV, a series of open-source instruction-tuned Verilog generation LLMs. Instead of generating descriptions first and then obtaining the corresponding code from advanced LLMs, we prompt the LLM with Verilog code and let it generate the corresponding natural language description through multi-level summarization. Experimental results show that CodeV surpasses the previous open-source SOTA by a relative 14.4% on VerilogEval (BetterV) and 11.3% on RTLLM (RTLCoder), and also outperforms the previous commercial SOTA, GPT-4, by a relative 22.1% on VerilogEval.

Programming Languages, Artificial Intelligence

What problem does this paper attempt to address?

The paper is motivated by the growing complexity and cost of modern processor design, and it focuses on one part of that problem: using large language models (LLMs) to automatically generate high-quality Verilog, a hardware description language (HDL). Instruction-tuned LLMs perform well on code generation for general-purpose languages such as Python, but they perform poorly on Verilog, primarily because high-quality instruction-tuning data is scarce. Even advanced LLMs such as GPT-3.5 show limited performance on Verilog generation. To tackle this, the paper proposes CodeV, a series of open-source instruction-tuned Verilog generation models trained on data built with multi-level summarization. Rather than first generating descriptions and then asking an advanced LLM for the corresponding code, CodeV starts from real-world Verilog code and has the LLM write the matching natural language description. According to the reported results, CodeV achieves a relative improvement of 14.4% over BetterV on VerilogEval and 11.3% over RTLCoder on RTLLM, the previous open-source state of the art, and a relative improvement of 22.1% over GPT-4 on VerilogEval.
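The code-to-description direction described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the two-level split (a detailed description first, then a concise high-level specification), the prompt wording, and the names `summarize_multi_level`, `build_dataset`, and `fake_llm` are all assumptions made for illustration, and the model call is a stub that would be replaced by a real LLM client.

```python
from typing import Callable, Dict, List

# Hypothetical prompt templates; the wording is an assumption, not the paper's prompts.
DETAIL_PROMPT = "Describe in detail what the following Verilog module does:\n\n{code}"
SUMMARY_PROMPT = (
    "Given this Verilog module and its detailed description, write a concise "
    "high-level specification of the module.\n\n"
    "Code:\n{code}\n\nDetailed description:\n{detail}"
)


def summarize_multi_level(code: str, llm: Callable[[str], str]) -> Dict[str, str]:
    """Turn one real-world Verilog module into an (instruction, output) pair.

    Level 1 asks for a detailed description of the code; level 2 condenses it
    into a high-level specification. The original code is kept as the target.
    """
    detail = llm(DETAIL_PROMPT.format(code=code))
    summary = llm(SUMMARY_PROMPT.format(code=code, detail=detail))
    return {"instruction": summary, "output": code}


def build_dataset(modules: List[str], llm: Callable[[str], str]) -> List[Dict[str, str]]:
    """Apply the two-level summarization to every collected module."""
    return [summarize_multi_level(module, llm) for module in modules]


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call so the sketch runs offline."""
    if prompt.startswith("Describe"):
        return "A 2-input AND gate that drives y with a & b."
    return "Implement a 2-input AND gate."


if __name__ == "__main__":
    and2 = "module and2(input a, input b, output y); assign y = a & b; endmodule"
    for pair in build_dataset([and2], fake_llm):
        print(pair["instruction"])
        print(pair["output"])
```

The point of the design is that the training target is always human-written Verilog, while the LLM contributes only the description, which is the task the abstract says it does well.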

VeriGen: A Large Language Model for Verilog Code Generation

Benchmarking Large Language Models for Automated Verilog RTL Code Generation

AutoVCoder: A Systematic Framework for Automated Verilog Code Generation using LLMs

RTLCoder: Fully Open-Source and Efficient LLM-Assisted RTL Code Generation Technique

Advanced Large Language Model (LLM)-Driven Verilog Development: Enhancing Power, Performance, and Area Optimization in Code Synthesis

Large Language Model for Verilog Generation with Golden Code Feedback

Revisiting VerilogEval: Newer LLMs, In-Context Learning, and Specification-to-RTL Tasks

A Multi-Expert Large Language Model Architecture for Verilog Code Generation

BetterV: Controlled Verilog Generation with Discriminative Guidance

VersiCode: Towards Version-controllable Code Generation

Data is all you need: Finetuning LLMs for Chip Design via an Automated design-data augmentation framework

Source Code Summarization in the Era of Large Language Models

VHDL-Eval: A Framework for Evaluating Large Language Models in VHDL Code Generation

HiVeGen -- Hierarchical LLM-based Verilog Generation for Scalable Chip Design

Evaluating Large Language Models for Automatic Register Transfer Logic Generation via High-Level Synthesis

How Well Do LLMs Generate Code for Different Application Domains? Benchmark and Evaluation

Towards LLM-Powered Verilog RTL Assistant: Self-Verification and Self-Correction

PromptV: Leveraging LLM-powered Multi-Agent Prompting for High-quality Verilog Generation

CodeT5+: Open Code Large Language Models for Code Understanding and Generation