Acceleration for Timing-Aware Gate-Level Logic Simulation with One-Pass GPU Parallelism

Weijie Fang, Yanggeng Fu, Jiaquan Gao, Longkun Guo, Gregory Gutin, Xiaoyan Zhang
2023-04-26
Abstract: Witnessing the advancing scale and complexity of chip design and benefiting from high-performance computation technologies, the simulation of Very Large Scale Integration (VLSI) Circuits imposes an increasing requirement for acceleration through parallel computing with GPU devices. However, the conventional parallel strategies do not fully align with modern GPU abilities, leading to new challenges in the parallelism of VLSI simulation when using GPU, despite some previous successful demonstrations of significant acceleration. In this paper, we propose a novel approach to accelerate 4-value logic timing-aware gate-level logic simulation using waveform-based GPU parallelism. Our approach utilizes a new strategy that can effectively handle the dependency between tasks during the parallelism, reducing the synchronization requirement between CPU and GPU when parallelizing the simulation on combinational circuits. This approach requires only one round of data transfer and hence achieves one-pass parallelism. Moreover, to overcome the difficulty of adopting our strategy on GPU devices, we design a series of data structures and tune them to dynamically allocate and store newly generated output of uncertain scale. Finally, experiments are carried out on industrial-scale open-source benchmarks to demonstrate the performance gain of our approach compared to several state-of-the-art baselines.
Data Structures and Algorithms; Hardware Architecture; Distributed, Parallel, and Cluster Computing
What problem does this paper attempt to address?
This paper addresses the acceleration of timing-aware gate-level logic simulation on modern GPU devices. Specifically, it focuses on speeding up 4-value, timing-aware gate-level logic simulation with a waveform-based GPU parallel strategy while reducing the need for synchronization between the CPU and the GPU. Conventional parallel strategies do not fully exploit the capabilities of modern GPUs on large-scale VLSI circuit simulation, which creates new challenges. The paper proposes a method that handles the dependencies between tasks so that the whole computation can be parallelized with a single round of data transfer, which minimizes synchronization cost and improves computational efficiency.
The main contributions are new data structures that support this one-pass parallel processing, significantly reducing how often memory must be re-allocated during the parallel run and avoiding frequent CPU-GPU synchronization. The paper also proposes a waveform-based GPU parallel strategy that supports 4-value logic, and it verifies the resulting performance gain experimentally on industrial-scale open-source benchmarks. The method shows higher utilization of GPU computing resources in particular when the time costs of the parallel tasks differ significantly. In summary, the core problem is to accelerate 4-value, timing-aware gate-level logic simulation on GPUs by improving the data structures and the parallel strategy so that synchronization overhead falls and overall computational efficiency rises.
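To make the terms "4-value logic" and "waveform-based" concrete, the following is a minimal sequential sketch in Python. It is not the authors' GPU implementation. It assumes the conventional four values 0, 1, X (unknown), and Z (high impedance), represents a waveform as a time-sorted list of `(time, value)` events, and applies a fixed transport delay. The names `and4`, `value_at`, and `simulate_gate` are hypothetical and chosen only for illustration.

```python
from itertools import chain

# Four logic values: 0, 1, unknown (X), high impedance (Z).
ZERO, ONE, X, Z = "0", "1", "X", "Z"

def and4(a, b):
    """4-value AND: a 0 on either input dominates; X or Z otherwise yields X."""
    if a == ZERO or b == ZERO:
        return ZERO
    if a == ONE and b == ONE:
        return ONE
    return X

def value_at(waveform, t):
    """Value of a waveform at time t (waveform is a time-sorted event list)."""
    current = X  # assumption: the signal is unknown before its first event
    for time, value in waveform:
        if time > t:
            break
        current = value
    return current

def simulate_gate(gate_fn, wave_a, wave_b, delay):
    """Evaluate a 2-input gate over whole input waveforms in one call."""
    out = []
    # Every input event time is a candidate time for an output change.
    times = sorted({t for t, _ in chain(wave_a, wave_b)})
    for t in times:
        v = gate_fn(value_at(wave_a, t), value_at(wave_b, t))
        if not out or out[-1][1] != v:  # record only real transitions
            out.append((t + delay, v))
    return out

a = [(0, ZERO), (10, ONE), (30, Z)]
b = [(0, ONE), (20, ZERO), (25, ONE)]
result = simulate_gate(and4, a, b, delay=2)
print(result)
```

For these inputs the output waveform is `[(2, '0'), (12, '1'), (22, '0'), (27, '1'), (32, 'X')]`. The number of output events depends on the input data and is not known before the gate is evaluated, which illustrates why storing output "of uncertain scale" on a GPU calls for dynamic allocation.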