Abstract: Efficient simulation of quantum circuits has become indispensable with the rapid development of quantum hardware. The primary simulation methods are based on state vectors and tensor networks. As the number of qubits and quantum gates in current quantum devices grows, traditional state-vector-based simulation methods become inadequate because of the overwhelming size of the Hilbert space and extensive entanglement. Consequently, brute-force tensor network simulation algorithms become the only viable solution in such scenarios. The two main challenges in tensor network simulation are finding an optimal contraction path and executing it efficiently on modern computing devices, with the latter determining the actual efficiency. In this study, we investigate the optimization of such tensor network simulations on modern GPUs and propose general optimization strategies from two aspects: computational efficiency and accuracy. First, we propose transforming critical Einstein summation operations into GEMM operations, leveraging the specific features of tensor network simulations to improve GPU efficiency. Second, by analyzing the data characteristics of quantum circuits, we employ extended precision to ensure the accuracy of simulation results and mixed precision to fully exploit the potential of GPUs, resulting in faster and more precise simulations. Our numerical experiments demonstrate that our approach achieves a 3.96x reduction in verification time for random quantum circuit samples in the 18-cycle case of Sycamore, with sustained performance exceeding 21 TFLOPS on one A100. The method extends readily to the 20-cycle case while maintaining the same performance, yielding a 12.5x speedup over the state-of-the-art CPU-based results and a 4.48-6.78x speedup over the state-of-the-art GPU-based results reported in the literature.
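To illustrate the general idea of recasting an Einstein summation as a GEMM, the following is a minimal NumPy sketch, not the authors' GPU implementation. The function name `einsum_as_gemm`, the index pattern `ikjl,lmk->ijm`, and the tensor shapes are all assumptions chosen for illustration. The contracted indices are first permuted so that they are adjacent, then the free and contracted indices are each merged into a single matrix dimension, so the contraction becomes one matrix multiply.

```python
import numpy as np


def einsum_as_gemm(a, b):
    """Compute einsum('ikjl,lmk->ijm', a, b) with a single matrix multiply.

    Hypothetical helper for illustration: the index pattern is fixed, whereas
    a real simulator would derive the permutations from the contraction path.
    """
    i, k, j, l = a.shape
    l2, m, k2 = b.shape
    assert (k, l) == (k2, l2), "contracted dimensions must match"

    # Step 1: permute so free indices come first in `a` and last in `b`,
    # with the contracted indices (k, l) in the same order in both tensors.
    a_perm = a.transpose(0, 2, 1, 3)   # a[i, j, k, l]
    b_perm = b.transpose(2, 0, 1)      # b[k, l, m]

    # Step 2: merge index groups into matrix dimensions.
    a_mat = a_perm.reshape(i * j, k * l)
    b_mat = b_perm.reshape(k * l, m)

    # Step 3: one GEMM, then split the merged free indices again.
    return (a_mat @ b_mat).reshape(i, j, m)


rng = np.random.default_rng(0)


def random_complex(shape):
    # Single-precision complex tensors, as an assumed stand-in for gate tensors.
    data = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
    return data.astype(np.complex64)


a = random_complex((3, 2, 4, 5))   # indices i, k, j, l
b = random_complex((5, 6, 2))      # indices l, m, k

result = einsum_as_gemm(a, b)
reference = np.einsum("ikjl,lmk->ijm", a, b)
max_abs_diff = float(np.max(np.abs(result - reference)))
```

In this sketch the permutation step costs an extra memory copy; whether that copy is worthwhile depends on the tensor shapes, which is one reason the choice of index ordering matters in practice.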

Communication Optimizations for State-vector Quantum Simulator on CPU+GPU Clusters

Fast scalable and low-power quantum circuit simulation on the cluster of GPUs platforms

Lazy Qubit Reordering for Accelerating Parallel State-Vector-based Quantum Circuit Simulation

MEMQSim: Highly Memory-Efficient and Modularized Quantum State-Vector Simulation

Queen: A quick, scalable, and comprehensive quantum circuit simulation for supercomputing

Quantum Computer Simulations at Warp Speed: Assessing the Impact of GPU Acceleration

Optimising Iteration Scheduling for Full-State Vector Simulation of Quantum Circuits on FPGAs

Efficient Quantum Circuit Simulation by Tensor Network Methods on Modern GPUs

Toward cost-effective quantum circuit simulation with performance tuning techniques

Mera: Memory Reduction and Acceleration for Quantum Circuit Simulation via Redundancy Exploration

Overcoming Memory Constraints in Quantum Circuit Simulation with a High-Fidelity Compression Framework

Atlas: Hierarchical Partitioning for Quantum Circuit Simulation on GPUs (Extended Version)

Optical experimental solution for the multiway number partitioning problem and its application to computing power scheduling

Energy Efficiency of Quantum Statevector Simulation at Scale

Parallel Simulation of Quantum Networks with Distributed Quantum State Management

Efficient techniques to GPU Accelerations of Multi-Shot Quantum Computing Simulations

Distributed Quantum Simulation

HyQuas

Ever more optimized simulations of fermionic systems on a quantum computer

Circuit Partitioning and Transmission Cost Optimization in Distributed Quantum Computing

Accelerating Decision Diagram-based Multi-node Quantum Simulation with Ring Communication and Automatic SWAP Insertion