Deep Reinforcement Learning with Reward Design for Quantum Control

Haixu Yu, Xudong Zhao
DOI: https://doi.org/10.1109/tai.2022.3225256
2024-01-01
IEEE Transactions on Artificial Intelligence
Abstract: Deep reinforcement learning (DRL) has been recognized as a powerful tool in quantum physics, where reward design is non-trivial but crucial for quantum control tasks. To reduce the reliance on human empirical knowledge when designing DRL rewards, we propose a DRL approach with a novel reward paradigm designed from learning process information (DRL-LPI), where the learning process information (LPI) comprises the state information and the experiences. In DRL-LPI, the state information, after being classified by a fidelity threshold, and the experiences are first stored simultaneously in their respective sequences, and this process is repeated until a similar-segment ends. Then, the stored state information is converted to a real value and used to design the reward value by applying a self-amplitude function. Next, the designed reward values are integrated with the stored experiences to compose transitions for DRL training. The proposed DRL-LPI is compared with five representative reward schemes on two typical quantum control tasks, i.e., spin-1/2 quantum state control and many-coupled-qubit state control, and the experimental results show its superior learning efficiency and control performance. Further results show that DRL-LPI can learn control strategies with fewer control actions than stochastic gradient descent (SGD) and the genetic algorithm (GA).
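The pipeline outlined in the abstract (classify states by a fidelity threshold, buffer labels and experiences until a segment ends, then turn the buffered labels into rewards and attach them to the experiences) can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration only: the abstract does not define a "similar-segment" or the self-amplitude function, so the sketch treats a segment as a run of consecutive states with the same class label and uses a hypothetical `placeholder_reward` in place of the paper's reward function. Only the pure-state fidelity F = |<target|psi>|^2 is standard.

```python
from math import sqrt
from typing import List, Tuple

State = Tuple[complex, complex]          # amplitudes of a spin-1/2 pure state
Experience = Tuple[State, int, State]    # (state, action, next_state)

def fidelity(psi: State, target: State) -> float:
    """Pure-state fidelity F = |<target|psi>|^2 (both states normalized)."""
    overlap = sum(t.conjugate() * p for t, p in zip(target, psi))
    return abs(overlap) ** 2

def classify(f: float, threshold: float) -> int:
    """Label a state 1 if its fidelity reaches the threshold, else 0."""
    return 1 if f >= threshold else 0

def placeholder_reward(labels: List[int]) -> List[float]:
    """HYPOTHETICAL stand-in for the paper's self-amplitude function.

    Maps the segment's label to +1 or -1 and scales it by the segment length.
    """
    value = 1.0 if labels[0] == 1 else -1.0
    return [value * len(labels)] * len(labels)

def flush(labels: List[int], buffer: List[Experience]):
    """Attach designed rewards to buffered experiences to form transitions."""
    rewards = placeholder_reward(labels)
    return [(s, a, r, s2) for (s, a, s2), r in zip(buffer, rewards)]

def build_transitions(experiences: List[Experience], target: State,
                      threshold: float):
    """Buffer labels and experiences; flush whenever the label changes.

    ASSUMPTION: a "similar-segment" is a run of equal class labels.
    """
    transitions, labels, buffer = [], [], []
    for state, action, next_state in experiences:
        label = classify(fidelity(next_state, target), threshold)
        if labels and label != labels[-1]:   # the current segment has ended
            transitions += flush(labels, buffer)
            labels, buffer = [], []
        labels.append(label)
        buffer.append((state, action, next_state))
    if labels:                               # flush the final segment
        transitions += flush(labels, buffer)
    return transitions

if __name__ == "__main__":
    up: State = (1 + 0j, 0j)
    down: State = (0j, 1 + 0j)
    plus: State = (1 / sqrt(2) + 0j, 1 / sqrt(2) + 0j)
    episode = [(down, 0, plus), (plus, 1, plus), (plus, 0, up)]
    for s, a, r, s2 in build_transitions(episode, target=up, threshold=0.9):
        print(a, r)
```

In this toy episode the first two next-states fall below the threshold and form one segment, while the last one reaches the target and forms its own segment, so the rewards are assigned per segment rather than per step. The resulting `(state, action, reward, next_state)` tuples would then be fed to whatever DRL learner is in use.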