Transferable Neural Wavefunctions for Solids

Leon Gerard, Michael Scherbela, Halvard Sutterud, Matthew Foulkes, Philipp Grohs
2024-05-13
Abstract: Deep-Learning-based Variational Monte Carlo (DL-VMC) has recently emerged as a highly accurate approach for finding approximate solutions to the many-electron Schrödinger equation. Despite its favorable scaling with the number of electrons, $\mathcal{O}(n_\text{el}^{4})$, the practical value of DL-VMC is limited by the high cost of optimizing the neural network weights for every system studied. To mitigate this problem, recent research has proposed optimizing a single neural network across multiple systems, reducing the cost per system. Here we extend this approach to solids, where similar but distinct calculations using different geometries, boundary conditions, and supercell sizes are often required. We show how to optimize a single ansatz across all of these variations, reducing the required number of optimization steps by an order of magnitude. Furthermore, we exploit the transfer capabilities of a pre-trained network. We successfully transfer a network, pre-trained on 2x2x2 supercells of LiH, to 3x3x3 supercells. This reduces the number of optimization steps required to simulate the large system by a factor of 50 compared to previous work.
Computational Physics, Machine Learning
What problem does this paper attempt to address?
The paper addresses the high computational cost of applying Deep-Learning-based Variational Monte Carlo (DL-VMC) to solids. Specifically:

1. **High optimization cost**: Although DL-VMC scales favorably with the number of electrons, $\mathcal{O}(n_\text{el}^{4})$, the neural network weights must be optimized again for every new system, which makes each calculation expensive.
2. **Many similar calculations**: Solid-state studies often require similar but distinct calculations across different geometries, boundary conditions, and supercell sizes. Conventional DL-VMC handles one system at a time, so covering all of these variations is very time-consuming.
3. **Finite-size effects**: Reducing finite-size effects requires larger supercells, which further increases the computational load.

The paper proposes a transferable DL-VMC wavefunction: a single neural network ansatz optimized across different geometries, boundary conditions, and supercell sizes, which reduces the required number of optimization steps by an order of magnitude. This lowers the cost per system and makes twist-averaged calculations easier to perform. For example, for lithium hydride (LiH), transferring a network pre-trained on 2x2x2 supercells to 3x3x3 supercells reduced the number of optimization steps by a factor of about 50 compared to previous work, while also yielding more accurate results.
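To make the twist-averaging step concrete, here is a minimal Python sketch of how per-twist energies are combined into one twist-averaged estimate. It is an illustration under stated assumptions, not the paper's implementation: the Gamma-centered grid, the function names, and in particular `toy_energy` (a hypothetical stand-in for a per-twist DL-VMC energy estimate) are all invented for this example.

```python
from itertools import product


def gamma_centered_twists(n):
    """Return the n**3 twists of a Gamma-centered n x n x n grid.

    Twists are given in fractional units of the supercell reciprocal
    lattice vectors (an assumed convention for this sketch).
    """
    return [(i / n, j / n, k / n) for i, j, k in product(range(n), repeat=3)]


def twist_average(energies, weights=None):
    """Weighted mean of per-twist energies (equal weights by default)."""
    if weights is None:
        weights = [1.0] * len(energies)
    if len(weights) != len(energies):
        raise ValueError("weights and energies must have the same length")
    total_weight = sum(weights)
    return sum(w * e for w, e in zip(weights, energies)) / total_weight


def toy_energy(twist):
    """Hypothetical placeholder for a per-twist energy; not a physical model."""
    return -1.0 + 0.1 * sum(t * t for t in twist)


twists = gamma_centered_twists(2)
energies = [toy_energy(t) for t in twists]
e_avg = twist_average(energies)
print(f"{len(twists)} twists, twist-averaged energy = {e_avg:.4f}")
```

In a conventional workflow each entry of `energies` would come from a separately optimized wavefunction. The approach summarized above would instead obtain all of them from one shared ansatz, which is where the saving in optimization steps comes from.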