Abstract: As chip designs grow in scale and complexity and high-performance computing technologies mature, the simulation of Very Large Scale Integration (VLSI) circuits increasingly requires acceleration through parallel computing on GPU devices. However, conventional parallel strategies do not fully align with the capabilities of modern GPUs, which creates new challenges for parallel VLSI simulation on GPUs despite earlier demonstrations of significant acceleration. In this paper, we propose a novel approach to accelerate 4-value logic timing-aware gate-level logic simulation using waveform-based GPU parallelism. Our approach uses a new strategy that effectively handles the dependencies between tasks during parallel execution, reducing the synchronization required between CPU and GPU when parallelizing the simulation of combinational circuits. The approach requires only one round of data transfer and hence achieves one-pass parallelism. Moreover, to overcome the difficulty of adopting this strategy on GPU devices, we design a series of data structures and tune them to dynamically allocate and store newly generated output of unpredictable size. Finally, experiments on industrial-scale open-source benchmarks demonstrate the performance gain of our approach over several state-of-the-art baselines.
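
To make "waveform-based" 4-value simulation concrete, the following is a minimal Python sketch, not the paper's GPU implementation. It assumes a waveform is a time-sorted list of `(time, value)` transitions over the values `0`, `1`, `X`, and `Z`, and that the gate has a single fixed transport delay. The names `and4`, `value_at`, and `simulate_and` are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# Assumed representation: (time, value) transitions, sorted by time.
Waveform = List[Tuple[int, str]]

def and4(a: str, b: str) -> str:
    """4-value AND: a controlling 0 wins; X or Z on an input yields X otherwise."""
    if a == "0" or b == "0":
        return "0"
    if a == "1" and b == "1":
        return "1"
    return "X"

def value_at(wave: Waveform, t: int) -> str:
    """Value of a waveform at time t (last transition at or before t, else X)."""
    current = "X"
    for time, value in wave:
        if time > t:
            break
        current = value
    return current

def simulate_and(in_a: Waveform, in_b: Waveform, delay: int) -> Waveform:
    """Evaluate a 2-input AND gate over whole input waveforms in one call."""
    # The output can only change when one of the inputs changes.
    times = sorted({t for t, _ in in_a} | {t for t, _ in in_b})
    out: Waveform = []
    for t in times:
        v = and4(value_at(in_a, t), value_at(in_b, t))
        # Record a transition only when the output value actually changes.
        if not out or out[-1][1] != v:
            out.append((t + delay, v))
    return out

a = [(0, "0"), (10, "1")]
b = [(0, "1"), (20, "X"), (30, "0")]
print(simulate_and(a, b, delay=2))
```

Note that the number of output transitions depends on the input data and is not known before the gate is evaluated, which is why storing output of unpredictable size becomes a concern on a GPU.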

What problem does this paper attempt to address?

This paper addresses the problem of accelerating timing-aware gate-level logic simulation on modern GPU devices. Specifically, it focuses on speeding up 4-value logic timing-aware gate-level simulation through a waveform-based GPU parallel strategy while reducing the need for synchronization between the CPU and GPU. Traditional parallel strategies fail to fully utilize the capabilities of modern GPUs when simulating large-scale VLSI circuits. The paper proposes a new method that processes the entire computational task in parallel after a single data transfer, thereby minimizing synchronization costs and improving computational efficiency. Its main contribution is the design of new data structures that support this one-pass parallel processing, significantly reducing the frequency of memory re-allocation during parallel execution and avoiding frequent CPU-GPU synchronization. The paper also proposes a waveform-based GPU parallel strategy that supports 4-value logic and verifies the resulting performance improvement experimentally on industrial-scale open-source benchmarks. In particular, when the time costs of parallel tasks differ significantly, the method achieves higher utilization of GPU computing resources. In summary, the paper combines improved data structures with a new parallel strategy to accelerate 4-value logic timing-aware gate-level simulation and reduce synchronization overhead.
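
One common way to store output of unpredictable size without re-allocating memory is to reserve a pool of fixed-size chunks once and link chunks together on demand. The Python sketch below illustrates that general idea only; the abstract does not describe the paper's actual data structures, so `ChunkPool`, `ChunkedList`, and all of their fields are hypothetical.

```python
class ChunkPool:
    """Pool of equal-size chunks reserved once up front (hypothetical sketch)."""

    def __init__(self, num_chunks: int, chunk_size: int):
        self.num_chunks = num_chunks
        self.chunk_size = chunk_size
        self.data = [None] * (num_chunks * chunk_size)  # one contiguous buffer
        self.next_chunk = [-1] * num_chunks             # link to the next chunk
        self.free_head = 0  # bump counter; on a GPU this would be an atomic add

    def alloc(self) -> int:
        """Hand out the next unused chunk index; never touches the host."""
        if self.free_head >= self.num_chunks:
            raise MemoryError("pool exhausted")
        chunk = self.free_head
        self.free_head += 1
        return chunk


class ChunkedList:
    """Growable sequence stored as a linked chain of chunks from a pool."""

    def __init__(self, pool: ChunkPool):
        self.pool = pool
        self.head = pool.alloc()
        self.tail = self.head
        self.count = 0  # total number of stored items

    def append(self, item) -> None:
        offset = self.count % self.pool.chunk_size
        if self.count > 0 and offset == 0:
            # Current tail chunk is full: link in a fresh chunk from the pool.
            new_chunk = self.pool.alloc()
            self.pool.next_chunk[self.tail] = new_chunk
            self.tail = new_chunk
        self.pool.data[self.tail * self.pool.chunk_size + offset] = item
        self.count += 1

    def to_list(self) -> list:
        """Walk the chain and collect the stored items in order."""
        items, chunk, remaining = [], self.head, self.count
        while chunk != -1 and remaining > 0:
            n = min(remaining, self.pool.chunk_size)
            base = chunk * self.pool.chunk_size
            items.extend(self.pool.data[base:base + n])
            remaining -= n
            chunk = self.pool.next_chunk[chunk]
        return items


pool = ChunkPool(num_chunks=4, chunk_size=2)
wave_a, wave_b = ChunkedList(pool), ChunkedList(pool)
for event in [(2, "0"), (12, "1"), (22, "X")]:
    wave_a.append(event)
wave_b.append((5, "1"))
wave_a.append((32, "0"))
wave_a.append((40, "1"))
print(wave_a.to_list(), wave_b.to_list())
```

In this sketch the two lists interleave their chunks in one shared buffer, so each can grow independently without the buffer ever being resized or copied.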

Acceleration for Timing-Aware Gate-Level Logic Simulation with One-Pass GPU Parallelism

Logic Simulation Acceleration Based on GPU

Accelerating RTL Simulation with GPUs

Distributed Time, Conservative Parallel Logic Simulation on GPUs

Accelerate Logic Re-simulation on GPU via Gate/Event Parallelism and State Compression

Massively Parallel Logic Simulation with GPUs

General-Purpose Gate-Level Simulation with Partition-Agnostic Parallelism

CPGPUSim: A Multi-dimensional Parallel Acceleration Framework for RTL Simulation

Using GPU to accelerate a pin-based multi-level cache simulator

Accelerating GPGPU Architecture Simulation

GPU-Accelerated Static Timing Analysis

GPU-Accelerated Non-Linear Analog and Mixed-Signal Circuit Transient Simulation

Exploiting Parallelism in the Simulation of General Purpose Graphics Processing Unit Program

Adaptive Multidimensional Parallel Fault Simulation Framework on Heterogeneous System

GPU Acceleration in VLSI Back-end Design: Overview and Case Studies

Accelerating Static Timing Analysis Using CPU-GPU Heterogeneous Parallelism

Fast and Scalable Gate-Level Simulation in Massively Parallel Systems

Fastlanes: An FPGA Accelerated GPU Microarchitecture Simulator

Parallel Circuit Simulation on Multi/Many-core Systems

Using GPU to Accelerate Cache Simulation

GPU-based time parallel cache simulator