Arbitrary quantum states preparation aided by deep reinforcement learning

Zhao-Wei Wang, Zhao-Ming Wang
2024-07-23
Abstract: The preparation of quantum states is essential in the realm of quantum information processing, and the development of efficient methodologies can significantly alleviate the strain on quantum resources. Within the framework of deep reinforcement learning (DRL), we integrate the initial and the target state information within the state preparation task together, so as to realize the control trajectory design between two arbitrary quantum states. Utilizing a semiconductor double quantum dots (DQDs) model, our results demonstrate that the resulting control trajectories can effectively achieve arbitrary quantum state preparation (AQSP) for both single-qubit and two-qubit systems, with average fidelities of 0.9868 and 0.9556 for the test sets, respectively. Furthermore, we consider the noise around the system and the control trajectories exhibit commendable robustness against charge and nuclear noise. Our study not only substantiates the efficacy of DRL in QSP, but also provides a new solution for quantum control tasks of multi-initial and multi-objective states, and is expected to be extended to a wider range of quantum control problems.
Quantum Physics
What problem does this paper attempt to address?
This paper addresses the challenge of arbitrary quantum state preparation (AQSP): designing control trajectories that take any initial quantum state to any target quantum state. Traditional methods such as gradient-based optimization algorithms are effective, but they often require continuous pulses, which may not be ideal to implement in actual experiments. The paper therefore proposes a method based on deep reinforcement learning (DRL) for designing discrete control pulses more efficiently.

### Main problems and solutions

1. **Requirements for arbitrary quantum state preparation**:
   - In quantum information processing, precise control of quantum state preparation is crucial for quantum computing and simulation.
   - Traditional methods such as Chopped Random Basis (CRAB) and Gradient-Ascent Pulse Engineering (GRAPE) can solve the optimization problem, but they usually produce approximately continuous pulses, which are not ideal for experimental implementation.
2. **Application of deep reinforcement learning**:
   - The DRL algorithm integrates the information of the initial quantum state and the target quantum state into a unified representation, from which it designs an effective control trajectory.
   - The researchers verified the method on a semiconductor double-quantum-dot (DQD) model, achieving high-fidelity quantum state preparation for single-qubit and two-qubit systems, with average test-set fidelities of 0.9868 and 0.9556, respectively.
3. **Robustness in noisy environments**:
   - The research also considered the noise around the system (charge noise and nuclear noise) and demonstrated that the designed control trajectories are robust in such noisy environments.
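
To make the idea of a discrete control trajectory concrete, here is a minimal sketch of applying a sequence of piecewise-constant pulses to a single qubit and scoring the result by fidelity. The Hamiltonian form `H = (j*sigma_z + h*sigma_x)/2`, the values of `h` and `dt`, and the pulse set `ALLOWED_PULSES` are illustrative assumptions, not parameters taken from the paper, and the random pulse choice merely stands in for a trained agent.

```python
import numpy as np

SIGMA_X = np.array([[0, 1], [1, 0]], dtype=complex)
SIGMA_Z = np.array([[1, 0], [0, -1]], dtype=complex)


def step_unitary(j, h=1.0, dt=0.1):
    """Propagator exp(-i H dt) for the assumed toy Hamiltonian
    H = (j * sigma_z + h * sigma_x) / 2, held constant for a time dt."""
    hamiltonian = 0.5 * (j * SIGMA_Z + h * SIGMA_X)
    # H is Hermitian, so exponentiate it through its eigendecomposition.
    energies, vectors = np.linalg.eigh(hamiltonian)
    return vectors @ np.diag(np.exp(-1j * energies * dt)) @ vectors.conj().T


def apply_pulses(psi0, pulses, h=1.0, dt=0.1):
    """Evolve psi0 under a sequence of piecewise-constant pulse values."""
    psi = np.asarray(psi0, dtype=complex)
    for j in pulses:
        psi = step_unitary(j, h, dt) @ psi
    return psi


def fidelity(psi, target):
    """Pure-state fidelity |<target|psi>|^2."""
    return float(abs(np.vdot(target, psi)) ** 2)


# Hypothetical discrete action set; a DRL agent would pick one value per step.
ALLOWED_PULSES = (0.0, 1.0, 2.0, 3.0)
rng = np.random.default_rng(0)
random_pulses = rng.choice(ALLOWED_PULSES, size=20)
final_state = apply_pulses([1, 0], random_pulses)
print("fidelity with |1>:", fidelity(final_state, [0, 1]))
```

A learning agent would replace the random choice with a policy that selects each pulse from the current state and the target, using the fidelity as its training signal.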
### Method overview

- **Dataset construction**: Training, validation, and test datasets are constructed by uniformly sampling quantum states on the Bloch sphere.
- **Application of the DQN algorithm**: The Deep Q-Network (DQN) algorithm is used for training, with quantum state information converted into probability distributions that a neural network can process easily.
- **POVM method**: The positive operator-valued measure (POVM) method is adopted to handle the complex-valued elements of the density matrix.
- **Reward function design**: Fidelity serves as the reward function that guides the learning process of the DRL algorithm.

### Experimental results

- Single-qubit system: On the test set, the average fidelity reached 0.9868.
- Two-qubit system: On the test set, the average fidelity reached 0.9556.
- Noisy environment: Within a certain range of noise intensities, the control trajectories showed good robustness.

In conclusion, by combining DRL with the POVM method, the paper addresses the key problems in AQSP and provides a new solution for multi-objective quantum control tasks.
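
The dataset and POVM steps in the method overview can be illustrated with a short sketch for the single-qubit case. It assumes a tetrahedral (SIC) POVM and simple concatenation of the initial-state and target-state distributions. The summary does not say which POVM or which encoding the paper uses, so `TETRA`, `POVM`, and `network_input` are hypothetical choices, and random uniform sampling is only one reading of "uniformly sampling".

```python
import numpy as np

# Pauli matrices stacked as (x, y, z).
PAULIS = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)

# Assumed POVM: four Bloch vectors pointing to the corners of a tetrahedron.
TETRA = np.array([
    [1, 1, 1],
    [1, -1, -1],
    [-1, 1, -1],
    [-1, -1, 1],
]) / np.sqrt(3)

# POVM elements M_a = (I + s_a . sigma) / 4; they sum to the identity.
POVM = np.array([
    (np.eye(2) + np.einsum("k,kij->ij", s, PAULIS)) / 4 for s in TETRA
])


def sample_pure_states(n, rng):
    """Draw n pure qubit states uniformly over the Bloch sphere surface."""
    theta = np.arccos(rng.uniform(-1.0, 1.0, n))  # uniform in cos(theta)
    phi = rng.uniform(0.0, 2.0 * np.pi, n)
    return np.stack(
        [np.cos(theta / 2), np.exp(1j * phi) * np.sin(theta / 2)], axis=1
    )


def povm_probabilities(psi):
    """Map a pure state to the real probabilities p_a = Tr(M_a rho)."""
    rho = np.outer(psi, psi.conj())
    return np.real(np.einsum("aij,ji->a", POVM, rho))


def network_input(psi_initial, psi_target):
    """Concatenate both distributions into one real-valued input vector."""
    return np.concatenate(
        [povm_probabilities(psi_initial), povm_probabilities(psi_target)]
    )
```

Because every entry of the resulting vector is a real probability, it can be fed to a standard neural network without special handling of complex numbers, which is the motivation the summary gives for the POVM step.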