Quantum 8, 1265 (2024). https://doi.org/10.22331/q-2024-02-22-1265

Abstract: In this work, quantum transformers are designed and analysed in detail by extending state-of-the-art classical transformer neural network architectures, which are known to perform very well in natural language processing and image analysis. Building upon previous work that uses parametrised quantum circuits for data loading and orthogonal neural layers, we introduce three types of quantum transformers for training and inference, including a quantum transformer based on compound matrices, which guarantees a theoretical advantage of the quantum attention mechanism over its classical counterpart in terms of both asymptotic run time and the number of model parameters. These quantum architectures can be built using shallow quantum circuits and produce qualitatively different classification models. The three proposed quantum attention layers span a spectrum from closely following the classical transformers to exhibiting more quantum characteristics. As building blocks of the quantum transformer, we propose a novel method for loading a matrix as quantum states, as well as two new trainable quantum orthogonal layers adaptable to different levels of connectivity and quality of quantum computers. We performed extensive simulations of the quantum transformers on standard medical image datasets, which showed competitive, and at times better, performance compared to the classical benchmarks, including the best-in-class classical vision transformers. The quantum transformers we trained on these small-scale datasets require fewer parameters than standard classical benchmarks. Finally, we implemented our quantum transformers on superconducting quantum computers and obtained encouraging results for experiments with up to six qubits.
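To make the parameter-count claim concrete, the following is a minimal classical sketch of an orthogonal layer, not the circuits proposed in the paper. It assumes that each trainable two-qubit gate acts as a planar (Givens) rotation on a pair of amplitudes; the triangular nearest-neighbour layout and the names `givens` and `orthogonal_layer` are hypothetical choices made for illustration. Under these assumptions, an n x n orthogonal weight matrix is described by n(n-1)/2 angles rather than n^2 free entries.

```python
import numpy as np

def givens(n, i, j, theta):
    """Return the n x n rotation by theta in the (i, j) coordinate plane."""
    g = np.eye(n)
    c, s = np.cos(theta), np.sin(theta)
    g[i, i], g[i, j] = c, s
    g[j, i], g[j, j] = -s, c
    return g

def orthogonal_layer(thetas, n):
    """Compose n(n-1)/2 nearest-neighbour rotations into one orthogonal matrix.

    The triangular ordering of pairs is an illustrative assumption.
    """
    pairs = [(i, i + 1) for k in range(n - 1) for i in range(n - 1 - k)]
    assert len(thetas) == len(pairs)
    w = np.eye(n)
    for theta, (i, j) in zip(thetas, pairs):
        w = givens(n, i, j, theta) @ w
    return w

n = 4
rng = np.random.default_rng(0)
thetas = rng.uniform(0.0, 2.0 * np.pi, n * (n - 1) // 2)  # 6 angles for n = 4
W = orthogonal_layer(thetas, n)
x = rng.normal(size=n)
y = W @ x  # the layer preserves the Euclidean norm of its input
```

Because every factor is a rotation, `W` stays orthogonal for any choice of angles, so training the angles never requires re-orthogonalising the weights.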

Learning with SASQuaTCh: a Novel Variational Quantum Transformer Architecture with Kernel-Based Self-Attention

Quantum Vision Transformers

Quantum linear algebra is all you need for Transformer architectures

Quantum Vision Transformers for Quark-Gluon Classification

Learning the expressibility of quantum circuit ansatz using transformer

Transformer Models for Quantum Gate Set Tomography

Transformer variational wave functions for frustrated quantum spin systems

Unified Quantum State Tomography and Hamiltonian Learning Using Transformer Models: A Language-Translation-Like Approach for Quantum Systems

Transformer neural networks and quantum simulators: a hybrid approach for simulating strongly correlated systems

Hybrid Quantum Vision Transformers for Event Classification in High Energy Physics

A Novel Spatial-Temporal Variational Quantum Circuit to Enable Deep Learning on NISQ Devices

Tensor-network-assisted variational quantum algorithm

Quantum Embedding with Transformer for High-dimensional Data

End-to-End Quantum Vision Transformer: Towards Practical Quantum Speedup in Large-Scale Models

Quanvolutional Neural Networks: Powering Image Recognition with Quantum Circuits

Leveraging Pre-Trained Neural Networks to Enhance Machine Learning with Variational Quantum Circuits

Quixer: A Quantum Transformer Model

Quantformer: Learning Extremely Low-precision Vision Transformers

GQWformer: A Quantum-based Transformer for Graph Representation Learning

Quantum feedback control with a transformer neural network architecture

Quantum-Train with Tensor Network Mapping Model and Distributed Circuit Ansatz