Abstract: Due to the growing complexity of modern Integrated Circuits (ICs), there is a need for automated circuit design methods. Recent years have seen rising research in hardware design language generation to facilitate the design process. In this work, we propose a Verilog generation framework, BetterV, which fine-tunes large language models (LLMs) on processed domain-specific datasets and incorporates generative discriminators for guidance on particular design demands. The Verilog modules are collected from the internet, then filtered and processed to form a clean and abundant dataset. Instruct-tuning methods are specially designed to fine-tune the LLMs to understand Verilog. Furthermore, data are augmented to enrich the training set and are also used to train a generative discriminator on a particular downstream task, which provides guidance for the LLMs to optimize the Verilog implementation. BetterV can generate syntactically and functionally correct Verilog and can outperform GPT-4 on the VerilogEval benchmark. With the help of a task-specific generative discriminator, BetterV achieves remarkable improvement on various electronic design automation (EDA) downstream tasks, including netlist node reduction for synthesis and verification runtime reduction with Boolean Satisfiability (SAT) solving.

What problem does this paper attempt to address?

The paper addresses how to use large language models (LLMs) to automatically generate functionally correct and well-optimized Verilog code in the field of electronic design automation (EDA). It proposes a framework named BetterV, which targets the challenges in existing methods in the following ways:

1. **Enhancing LLMs' understanding of Verilog**: Domain-specific instruction tuning enables LLMs to better understand the Verilog language and its design requirements.
2. **Generating high-quality Verilog code**: Data augmentation increases the diversity and quantity of training data, reduces the risk of overfitting, and improves the quality of the generated code.
3. **Optimizing downstream tasks**: A generative discriminator guides the LLM to take specific downstream task requirements into account when generating Verilog code, such as logic synthesis node reduction and verification runtime reduction.

### Main contributions

1. **Pioneering application**: BetterV is the first work to apply controllable text generation techniques to engineering optimization challenges, especially the optimization of EDA downstream tasks.
2. **Task-driven method**: BetterV is the first Verilog generation method oriented by downstream tasks. Guided by task-specific discriminators, it improves training efficiency and practical application value.
3. **Surpassing existing models**: Using pre-trained models with 6.7B/7B parameters and without relying on prompt engineering strategies, BetterV can generate syntactically and functionally correct Verilog code and outperform GPT-4 on the VerilogEval benchmark.
4. **Data augmentation**: The paper proposes a data augmentation method that implements diverse specifications in Verilog, which addresses the scarcity of Verilog resources.

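
The discriminator guidance mentioned above can be illustrated with a minimal sketch. It assumes a GeDi-style scheme in which the base LLM's next-token distribution is reweighted by the probability that each candidate token leads toward the desired class (for example, Verilog that synthesizes to fewer netlist nodes). The function names, the equal class priors, the per-token simplification, and the `omega` exponent are illustrative assumptions, not details confirmed by the summary.

```python
from typing import Dict


def class_posterior(p_desired: Dict[str, float],
                    p_undesired: Dict[str, float]) -> Dict[str, float]:
    """Bayes rule with equal class priors (an assumption of this sketch).

    p_desired[tok] and p_undesired[tok] are the probabilities that a
    hypothetical class-conditional discriminator assigns to `tok` under
    the desired and undesired class, respectively.
    """
    return {
        tok: p_desired[tok] / (p_desired[tok] + p_undesired[tok])
        for tok in p_desired
    }


def guided_distribution(lm_probs: Dict[str, float],
                        posterior: Dict[str, float],
                        omega: float = 1.0) -> Dict[str, float]:
    """Reweight LM next-token probabilities by posterior ** omega, then renormalize.

    omega = 0 leaves the LM distribution unchanged; larger values push
    generation harder toward the desired class.
    """
    weights = {tok: p * posterior[tok] ** omega for tok, p in lm_probs.items()}
    total = sum(weights.values())
    if total == 0.0:
        raise ValueError("all candidate tokens received zero weight")
    return {tok: w / total for tok, w in weights.items()}


# Toy example with two candidate tokens the LM considers equally likely.
lm = {"a": 0.5, "b": 0.5}
post = class_posterior({"a": 0.8, "b": 0.2}, {"a": 0.2, "b": 0.8})
guided = guided_distribution(lm, post, omega=1.0)
```

In this toy case the discriminator prefers token `a`, so the guided distribution shifts probability mass toward it while the base LLM itself stays untouched. That separation is the appeal of discriminator guidance: the task-specific preference can be swapped without retraining the generator.
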
### Experimental results

In terms of functional correctness, BetterV performs better than other models on the VerilogEval benchmark, especially on test cases written by humans. The specific results are shown in the following table:

| Model | VerilogEval-machine (pass@1) | VerilogEval-machine (pass@5) | VerilogEval-machine (pass@10) | VerilogEval-human (pass@1) | VerilogEval-human (pass@5) | VerilogEval-human (pass@10) |
| ------ | ------ | ------ | ------ | ------ | ------ | ------ |
| GPT-3.5 | 46.7 | 69.1 | 74.1 | 26.7 | 45.8 | 51.7 |
| GPT-4 | 60.0 | 70.6 | 73.5 | 43.5 | 55.8 | 58.9 |
| CodeLlama | 43.1 | 47.1 | 47.7 | 18.2 | 22.7 | 24.3 |
| DeepSeek | 52.2 | 55.4 | 56.8 | 30.2 | 33.9 | 34.9 |
| CodeQwen | 46.5 | 54.9 | 56.4 | 22.5 | 26.1 | 28.0 |
| ChipNeMo | 43.4 | - | - | 22.4 | - | - |
| Thakur et al. | 44.0 | 52.6 | 59.2 | 30.3 | 43.9 | 49.6 |
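
For readers unfamiliar with the pass@k columns, the following is a minimal sketch of how such a score is typically computed for one problem. It assumes the standard unbiased estimator widely used for code-generation benchmarks, pass@k = 1 - C(n-c, k) / C(n, k), where n samples are generated and c of them pass the tests. The summary does not state which estimator was used, so treat this as an assumption; the function name is illustrative.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate the probability that at least one of k samples is correct.

    n: total samples generated for the problem
    c: number of those samples that pass the functional tests
    k: evaluation budget (e.g. 1, 5, or 10)
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    if n - c < k:
        # Fewer than k incorrect samples: every draw of k contains a correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


# Example: 10 samples for one problem, 3 of which pass.
score_k1 = pass_at_k(n=10, c=3, k=1)
score_k5 = pass_at_k(n=10, c=3, k=5)
```

A benchmark-level figure like those in the table would then be the mean of this per-problem value over all problems, expressed as a percentage.
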

BetterV: Controlled Verilog Generation with Discriminative Guidance
