Stable and efficient differentiation of tensor network algorithms

Anna Francuz, Norbert Schuch, Bram Vanhecke
2023-11-24
Abstract: Gradient based optimization methods are the established state-of-the-art paradigm to study strongly entangled quantum systems in two dimensions with Projected Entangled Pair States. However, the key ingredient, the gradient itself, has proven challenging to calculate accurately and reliably in the case of a corner transfer matrix (CTM)-based approach. Automatic differentiation (AD), which is the best known tool for calculating the gradient, still suffers some crucial shortcomings. Some of these are known, like the problem of excessive memory usage and the divergences which may arise when differentiating a singular value decomposition (SVD). Importantly, we also find that there is a fundamental inaccuracy in the currently used backpropagation of SVD that had not been noted before. In this paper, we describe all these problems and provide them with compact and easy to implement solutions. We analyse the impact of these changes and find that the last problem -- the use of the correct gradient -- is by far the dominant one and thus should be considered a crucial patch to any AD application that makes use of an SVD for truncation.
Quantum Physics, Strongly Correlated Electrons, Computational Physics
What problem does this paper attempt to address?
This paper addresses key problems that arise when tensor network algorithms, in particular those based on the corner transfer matrix (CTM), are used to study strongly entangled quantum systems. It focuses on three problems:

1. **Excessive memory consumption**: Automatic differentiation (AD) must store a large number of intermediate objects during the calculation, which leads to excessive memory requirements.
2. **Unstable gradient of the singular value decomposition (SVD)**: For degenerate spectra, the SVD gradient is unstable and may even be undefined.
3. **Errors in the backpropagation formula**: The existing formula for backpropagating gradients through an SVD is missing an important term, which introduces an additional approximation error.

### Solutions

The paper proposes the following solutions:

1. **Fixed-point differentiation and gauge fixing**:
   - To reduce memory consumption, the paper differentiates the fixed-point equation instead of every iteration step. This requires reliable gauge fixing of the CTM results.
   - Two practical gauge-fixing methods are proposed that ensure element-wise convergence of the CTM output.
2. **Eliminating instabilities in the SVD gradient**:
   - By introducing a previously ignored gauge freedom (Q-deformation), the paper removes the instability in the SVD gradient.
   - The method exploits a hidden gauge symmetry of the CTM iterations, which makes it possible to avoid the divergent terms without changing any physical properties.
3. **Correcting the SVD gradient formula**:
   - The paper derives a new SVD gradient formula that includes an important, previously ignored term and accounts for the influence of the truncated spectrum.
   - The new formula yields more accurate gradients and thereby significantly improves the accuracy of the optimization results.

### Conclusions

With these improvements, the paper shows how to significantly improve the stability and efficiency of AD-based tensor network algorithms. Using the correct gradient formula (i.e., the gradient containing the complete dP expression) improves the results the most. It requires only simple code modifications and is considered a crucial fix for any AD-based tensor network algorithm.

### Formula summary

- **Fixed-point differentiation formula**, for a fixed point \(x = f(x, A)\):
  \[ dx = \left(1 - \frac{\partial f}{\partial x}\right)^{-1} \frac{\partial f}{\partial A}\, dA \]
- **SVD gradient formula**, where \(A\) now denotes the matrix being decomposed and \(U\), \(S\), \(V\) are the kept factors of its truncated SVD, so that \(AV = US\) and \(A^\dagger U = VS\):
  \[ (1 - UU^\dagger)\, dA\, V = (1 - UU^\dagger)(dU\, S - A\, dV) \]
  \[ (1 - VV^\dagger)\, dA^\dagger\, U = (1 - VV^\dagger)(dV\, S - A^\dagger\, dU) \]
  Both relations follow from differentiating \(AV = US\) and \(A^\dagger U = VS\) and projecting out the kept subspace. The terms \(A\, dV\) and \(A^\dagger dU\) survive the projection only when part of the spectrum is truncated.

Together, these formulas underpin the memory-efficient and accurate gradient calculation described above.
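The fixed-point differentiation formula can be checked on a toy problem. The sketch below is only an illustration under stated assumptions, not the paper's CTM implementation: the scalar map `f(x, a) = 0.5 * cos(a * x)` is a hypothetical stand-in for one CTM step, and all function names are invented for this example. It compares the fixed-point formula with a derivative propagated through every iteration step and with a finite-difference estimate.

```python
import math


def f(x, a):
    """Toy contraction map standing in for one iteration step (hypothetical)."""
    return 0.5 * math.cos(a * x)


def df_dx(x, a):
    """Partial derivative of f with respect to x."""
    return -0.5 * a * math.sin(a * x)


def df_da(x, a):
    """Partial derivative of f with respect to the parameter a."""
    return -0.5 * x * math.sin(a * x)


def fixed_point(a, x0=0.0, tol=1e-14, max_iter=10_000):
    """Iterate x <- f(x, a) until successive iterates agree to within tol."""
    x = x0
    for _ in range(max_iter):
        x_new = f(x, a)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    raise RuntimeError("fixed-point iteration did not converge")


def grad_fixed_point(a):
    """dx/da = (1 - df/dx)^(-1) * df/da, evaluated at the converged x only."""
    x = fixed_point(a)
    return df_da(x, a) / (1.0 - df_dx(x, a))


def grad_unrolled(a, n_steps=200, x0=0.0):
    """Propagate the derivative through every iteration step (chain rule)."""
    x, dx = x0, 0.0
    for _ in range(n_steps):
        # The right-hand side uses the values of x and dx from before the update.
        x, dx = f(x, a), df_dx(x, a) * dx + df_da(x, a)
    return dx


def grad_finite_difference(a, eps=1e-6):
    """Central finite difference of the converged fixed point."""
    return (fixed_point(a + eps) - fixed_point(a - eps)) / (2.0 * eps)


if __name__ == "__main__":
    a = 1.2
    print("fixed-point formula:", grad_fixed_point(a))
    print("unrolled iteration: ", grad_unrolled(a))
    print("finite difference:  ", grad_finite_difference(a))
```

In this scalar case the inverse is a plain division. For a tensor-valued fixed point, `1 - ∂f/∂x` is a linear operator, so the inverse would instead be applied by solving a linear system. The formula only needs the converged `x`, which is why the paper's approach relies on gauge fixing to make the CTM output converge element-wise.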