Abstract: Quantum computing holds great promise for accelerating computational tasks, but quantum computers are still not widely accessible. To fill this gap, quantum computing simulators have been widely used for developing quantum circuits and algorithms. However, simulating quantum algorithms on classical computers is itself challenging because memory and computation requirements grow exponentially with the number of qubits. Many researchers have attempted to address these challenges on single-core, multi-core, and many-core systems, especially graphics processing units (GPUs). The diversity of CPU and GPU simulation of quantum circuits, spanning various CPU–GPU combinations and parameters such as qubit count, memory capacity, circuit depth, GPU performance, resource heterogeneity, and load imbalance, makes the problem even more challenging. Finding the best configuration requires an exhaustive search of the design space, which is not feasible in an acceptable time frame. Therefore, given the multitude of parameters and influential factors, an analytical model for selecting the proper configuration is desirable, and even essential for large systems. This paper proposes a novel analytical performance model for quantum circuit simulation on hybrid CPU–GPU platforms of various sizes, parameterized by the number of CPUs/GPUs, qubit count, memory capacity, quantum circuit depth, CPU/GPU performance, resource heterogeneity, and processing load. To do so, we evaluate a scalable and adaptive hybrid quantum simulator on a hybrid platform with multiple CPUs and GPUs across multiple hosts. The model analyzes the execution time of individual GPU kernels and the impact of major micro-architecture features on performance. By employing dynamic load partitioning (DLP) and the heterogeneous multi-GPU kernel, the model accurately identifies performance bottlenecks and estimates execution time. The proposed model achieves 94% accuracy compared to the experimental results on a hybrid multi-node cluster. The model therefore provides insights into scalability, efficiency, and load balancing in hybrid parallel systems, supporting code optimization and the development of efficient quantum algorithms and advanced quantum circuit simulation on hybrid parallel architectures.
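
To make the exponential memory requirement concrete, the following minimal sketch estimates the footprint of a simulated quantum state. It assumes full state-vector simulation with double-precision complex amplitudes (16 bytes each); the abstract does not state the simulation method or precision, so both are illustrative assumptions, and the function names are hypothetical.

```python
def statevector_bytes(num_qubits: int, bytes_per_amplitude: int = 16) -> int:
    """Bytes needed to hold all 2**n amplitudes of an n-qubit state vector.

    Assumption: 16 bytes per amplitude (two 64-bit floats); not taken from the paper.
    """
    if num_qubits < 0:
        raise ValueError("num_qubits must be non-negative")
    return (1 << num_qubits) * bytes_per_amplitude


def max_qubits(memory_bytes: int, bytes_per_amplitude: int = 16) -> int:
    """Largest qubit count whose full state vector fits in memory_bytes."""
    amplitudes = memory_bytes // bytes_per_amplitude
    if amplitudes < 1:
        raise ValueError("memory too small for even a single amplitude")
    # floor(log2(amplitudes)) computed with integer arithmetic
    return amplitudes.bit_length() - 1


if __name__ == "__main__":
    gib = 1 << 30
    for n in (20, 30, 40):
        print(f"{n} qubits -> {statevector_bytes(n) / gib:.4f} GiB")
    print("qubits that fit in 16 GiB:", max_qubits(16 * gib))
```

Under these assumptions each additional qubit doubles the required memory, which is why device memory capacity and the number of devices appear among the model's parameters.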
