Ultra-low latency quantum-inspired machine learning predictors implemented on FPGA

Lorenzo Borella, Alberto Coppi, Jacopo Pazzini, Andrea Stanco, Marco Trenti, Andrea Triossi, Marco Zanetti
2024-09-25
Abstract: Tensor Networks (TNs) are a computational paradigm used for representing quantum many-body systems. Recent works have shown how TNs can also be applied to perform Machine Learning (ML) tasks, yielding comparable results to standard supervised learning techniques. In this work, we study the use of Tree Tensor Networks (TTNs) in high-frequency real-time applications by exploiting the low-latency hardware of the Field-Programmable Gate Array (FPGA) technology. We present different implementations of TTN classifiers, capable of performing inference on classical ML datasets as well as on complex physics data. A preparatory analysis of bond dimensions and weight quantization is realized in the training phase, together with entanglement entropy and correlation measurements that help set the choice of the TTN architecture. The generated TTNs are then deployed on a hardware accelerator; using an FPGA integrated into a server, the inference of the TTN is completely offloaded. Finally, a classifier for High Energy Physics (HEP) applications is implemented and executed fully pipelined with sub-microsecond latency.
High Energy Physics - Experiment, Machine Learning, Quantum Physics
What problem does this paper attempt to address?
This paper explores how to use the low-latency hardware of Field-Programmable Gate Arrays (FPGAs) to implement a quantum-inspired machine learning predictor based on Tree Tensor Networks (TTNs) in high-frequency real-time applications. Specifically, the paper addresses the following key issues:

1. **Low-latency Inference**: Deploying a TTN on an FPGA achieves ultra-low-latency machine learning inference. This matters for applications that require rapid decision-making, such as the trigger pipeline in high-energy physics experiments.
2. **Resource Optimization**: The paper studies how to implement the TTN architecture within the limited hardware resources of an FPGA. This includes a preparatory analysis of bond dimension and weight quantization to select an appropriate TTN architecture and reduce resource consumption.
3. **Efficient Parallel Computation**: Two levels of parallelization, fully parallel (FP) and partially parallel (PP), are explored to optimize tensor contraction operations on the FPGA, improving computational efficiency and reducing latency.
4. **Model Interpretability**: Measuring the entanglement entropy in the TTN and the quantum correlations between features gives insight into how information is distributed inside the model and how it reaches its decisions. This helps explain the working mechanism of the model and provides a basis for feature selection.
5. **Practical Application Verification**: A classifier for high-energy physics (HEP) applications is implemented, and its fully pipelined execution on the FPGA is demonstrated with sub-microsecond prediction latency.

### Formula Summary

- **Tensor Contraction Formula**:
  \[ z_i=\sum_{j} \sum_{k} x_j y_k V_{ijk} \]
  where \(x\) and \(y\) are input vectors, and \(V\) is a third-order tensor in the TTN.
- **Two-Site Spin Correlation**:
  \[ C_{i,j}=\frac{\langle\Psi_{\text{TTN}}|\sigma_z^i \sigma_z^j|\Psi_{\text{TTN}}\rangle}{\langle\Psi_{\text{TTN}}|\Psi_{\text{TTN}}\rangle} \]
  where \(\sigma_z^i\) is the Pauli-\(z\) operator acting on site \(i\) and \(\Psi_{\text{TTN}}\) is the wave function corresponding to the TTN.
- **Two-Body Entanglement Entropy**:
  \[ S(\rho_A)=-\text{Tr}[\rho_A \log \rho_A]=-\text{Tr}[\rho_B \log \rho_B]=S(\rho_B) \]
  where \(\rho_A = \text{Tr}_B[\rho_{AB}]\), \(\rho_B=\text{Tr}_A[\rho_{AB}]\), and \(\rho_{AB}=|\Psi_{AB}\rangle\langle\Psi_{AB}|\) is the density matrix of the full system in the pure state \(|\Psi_{AB}\rangle\).

Through these methods and formulas, the paper shows how to implement an efficient, low-latency TTN machine learning predictor on an FPGA and apply it in fields such as high-energy physics.
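The tensor contraction and entanglement entropy formulas above can be illustrated with a minimal NumPy sketch. This is not the paper's FPGA firmware or training code: the function names `contract_node` and `entanglement_entropy`, the tensor shapes, and the numerical cutoff are assumptions chosen for illustration only. The entropy routine relies on the standard fact that, for a pure bipartite state, the squared singular values of the state reshaped into a matrix are the eigenvalues of both \(\rho_A\) and \(\rho_B\).

```python
import numpy as np


def contract_node(x, y, V):
    """Contract one tree node: z_i = sum_j sum_k x_j y_k V_ijk.

    x, y : 1-D input vectors of lengths J and K
    V    : third-order tensor of shape (I, J, K)
    """
    return np.einsum("j,k,ijk->i", x, y, V)


def entanglement_entropy(psi, dim_a, dim_b):
    """Von Neumann entropy S(rho_A) of a pure state on subsystems A and B.

    psi is a vector of length dim_a * dim_b; it is normalized internally.
    """
    mat = np.asarray(psi, dtype=complex).reshape(dim_a, dim_b)
    mat = mat / np.linalg.norm(mat)  # Frobenius norm, so <psi|psi> = 1
    # Squared singular values are the eigenvalues of rho_A (and of rho_B).
    probs = np.linalg.svd(mat, compute_uv=False) ** 2
    probs = probs[probs > 1e-12]  # assumed cutoff to avoid log(0)
    return float(-np.sum(probs * np.log(probs)))


# Illustrative usage with made-up shapes and values.
rng = np.random.default_rng(0)
x = rng.normal(size=2)
y = rng.normal(size=2)
V = rng.normal(size=(4, 2, 2))  # output bond dimension 4 is an assumption
z = contract_node(x, y, V)      # vector of length 4

bell = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)
s_bell = entanglement_entropy(bell, 2, 2)  # close to ln 2 for a Bell state
```

Since \(S(\rho_A)=S(\rho_B)\), the same function returns the entropy for either side of the bipartition. A product state gives an entropy of zero, while the maximally entangled two-qubit state gives \(\ln 2\).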