High-Throughput Condensed-Phase Hybrid Density Functional Theory for Large-Scale Finite-Gap Systems: The SeA Approach

Hsin-Yu Ko, Marcos F. Calegari Andrade, Zachary M. Sparrow, Ju-an Zhang, Robert A. DiStasio Jr
DOI: https://doi.org/10.1021/acs.jctc.2c00827
2023-07-02
Abstract: High-throughput DFT calculations are key to screening existing/novel materials, sampling potential energy surfaces, and generating quantum mechanical data for machine learning. By including a fraction of exact exchange (EXX), hybrid functionals reduce the self-interaction error in semi-local DFT and furnish a more accurate description of the underlying electronic structure, albeit at a high computational cost that often prohibits such high-throughput applications. To address this challenge, we have constructed SeA (SeA = SCDM+exx+ACE), a robust, accurate, and efficient framework for high-throughput condensed-phase hybrid DFT in the PWSCF module of Quantum ESPRESSO (QE) by combining: (1) the non-iterative selected columns of the density matrix (SCDM) orbital localization scheme, (2) a black-box and linear-scaling EXX algorithm (exx), and (3) adaptively compressed exchange (ACE). Across a diverse set of non-equilibrium (H$_2$O)$_{64}$ configurations (with densities spanning 0.4 g/cm$^3$ to 1.7 g/cm$^3$), SeA yields a one-to-two order-of-magnitude speedup (~8X to 26X) in the overall time-to-solution compared to PWSCF(ACE) in QE (~78X to 247X speedup compared to the conventional EXX implementation) and yields energies, ionic forces, and other properties with high fidelity. As a proof-of-principle high-throughput application, we trained a deep neural network (DNN) potential for ambient liquid water at the hybrid DFT level using SeA via an actively learned data set with ~8,700 (H$_2$O)$_{64}$ configurations. Using an out-of-sample set of (H$_2$O)$_{512}$ configurations (at non-ambient conditions), we confirmed the accuracy of this SeA-trained potential and showcased the capabilities of SeA by computing the ground-truth ionic forces in this challenging system containing > 1,500 atoms.
Materials Science; Computational Engineering, Finance, and Science
What problem does this paper attempt to address?
The paper addresses the high computational cost of hybrid DFT, which often rules out high-throughput calculations on large-scale, finite-gap condensed-phase systems. To reduce that cost, the authors developed SeA (SCDM + exx + ACE), a framework in the PWSCF module of Quantum ESPRESSO (QE) that combines three components:

1. **Selected Columns of the Density Matrix (SCDM)**: a non-iterative orbital localization scheme that generates localized occupied orbitals without a complex optimization procedure.
2. **Linear-scaling exact exchange (exx)**: a black-box algorithm that exploits the locality of the orbitals to compute the exact-exchange energy and related quantities at linear-scaling cost.
3. **Adaptively Compressed Exchange (ACE)**: a low-rank approximation that reduces the number of calls to the standard exact-exchange operator during the self-consistent field (SCF) procedure.

Together, these components substantially reduce the computational cost while maintaining accuracy. Across non-equilibrium (H$_2$O)$_{64}$ configurations with densities from 0.4 g/cm$^3$ to 1.7 g/cm$^3$, SeA shortened the overall time-to-solution by roughly 8 to 26 times relative to PWSCF(ACE) in QE, and by roughly 78 to 247 times relative to the conventional (non-ACE) EXX implementation, while reproducing energies, ionic forces, and other properties with high fidelity. As a proof-of-principle application, the authors used SeA to train a deep neural network (DNN) potential for ambient liquid water on an actively learned data set of about 8,700 (H$_2$O)$_{64}$ configurations, then checked it against SeA-computed ionic forces for out-of-sample (H$_2$O)$_{512}$ configurations (more than 1,500 atoms) at non-ambient conditions. These results support the use of SeA in high-throughput settings such as materials screening, potential energy surface sampling, and machine learning data generation.
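To make the ACE idea above concrete, here is a minimal NumPy sketch of the standard low-rank construction: given occupied orbitals $\Psi$ and the result $W = V_x \Psi$ of applying the exchange operator once, form $M = \Psi^\dagger W$, factor $-M = L L^\dagger$, and set $\xi = W L^{-\dagger}$, so that $V_\mathrm{ACE} = -\xi \xi^\dagger$ reproduces $V_x$ on the occupied space. This is an illustration under stated assumptions, not the QE or SeA implementation: the dense random matrix `v_x` is only a stand-in for a real exact-exchange operator, and the function names `build_ace_projector` and `apply_ace` are hypothetical.

```python
import numpy as np


def build_ace_projector(psi, w):
    """Return xi such that V_ACE = -xi @ xi^H satisfies V_ACE @ psi == w.

    psi : (n_grid, n_occ) orthonormal occupied orbitals
    w   : (n_grid, n_occ) exchange operator applied to psi, w = V_x @ psi
    """
    m = psi.conj().T @ w                # (n_occ, n_occ), negative definite
    m = 0.5 * (m + m.conj().T)          # symmetrize against round-off
    chol = np.linalg.cholesky(-m)       # -M = L L^H, L lower triangular
    # xi = W L^{-H}; computed as (L^{-1} W^H)^H
    return np.linalg.solve(chol, w.conj().T).conj().T


def apply_ace(xi, phi):
    """Apply the compressed operator: two thin matrix products, no call to V_x."""
    return -xi @ (xi.conj().T @ phi)


# Toy setup (assumption): a symmetric negative-definite matrix plays the role of V_x.
rng = np.random.default_rng(0)
n_grid, n_occ = 200, 8
a = rng.standard_normal((n_grid, n_grid))
v_x = -(a @ a.T) / n_grid - 0.1 * np.eye(n_grid)

# Orthonormal "occupied orbitals" from a thin QR factorization.
psi, _ = np.linalg.qr(rng.standard_normal((n_grid, n_occ)))

w = v_x @ psi                           # the one expensive application of V_x
xi = build_ace_projector(psi, w)

# The rank-n_occ operator agrees with V_x on the occupied orbitals.
max_err = np.max(np.abs(apply_ace(xi, psi) - w))
print("max |V_ACE psi - V_x psi| =", max_err)
```

In an SCF loop, the point of this construction is that `apply_ace` can be reused across inner iterations, so the expensive operator is applied only when `xi` is rebuilt.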