Accelerating Parallel First-Principles Excited-State Calculation by Low-Rank Approximation with K-Means Clustering
Qingcai Jiang, Junshi Chen, Lingyun Wan, Xinming Qin, Jielan Li, Jie Liu, Hong An, Wei Hu, Jinlong Yang
DOI: https://doi.org/10.1145/3545008.3545092
2022-01-01
Abstract: First-principles time-dependent density functional theory (TDDFT) is a powerful tool for accurately describing the excited-state properties of molecules and solids in condensed matter physics, computational chemistry, and materials science. However, a well-known drawback of TDDFT calculations is their ultrahigh computational cost, O(N^5) to O(N^6), and large memory usage, O(N^4), especially with a plane-wave basis set, which limits their applicability to large systems containing thousands of atoms. Here, we present a massively parallel implementation of linear-response TDDFT (LR-TDDFT) and reduce the complexity to O(N^3) by combining a K-Means clustering based low-rank approximation with an iterative eigensolver. Furthermore, we carefully design the parallel data and task distribution schemes to match the physical nature of the different steps of the computation, and we employ several optimization methods to handle efficiently the matrix operations and data communication involved in constructing and diagonalizing the LR-TDDFT Hamiltonian. In particular, our method reduces computation and memory costs by nearly 2 orders of magnitude compared to conventional LR-TDDFT calculations. Numerical results demonstrate that our implementation achieves an overall speedup of 10x and scales efficiently up to 12,288 CPU cores for large systems of up to 4,096 atoms, with run times of dozens of seconds.
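The abstract does not spell out how the K-Means clustering based low-rank approximation is built, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes an interpolation-style factorization: the products of occupied and unoccupied orbitals on a real-space grid are approximated from their values at a few interpolation points, and those points are picked by a weighted K-Means over the grid. The Gaussian toy orbitals, the weight function, and the names `weighted_kmeans`, `phi`, `psi`, `Z`, and `theta` are all hypothetical and chosen for illustration.

```python
import numpy as np

def weighted_kmeans(points, weights, k, n_iter=50, seed=0):
    """Return indices of grid points nearest to k weighted K-Means centroids."""
    rng = np.random.default_rng(seed)
    probs = weights / weights.sum()
    # Initialise centroids by sampling grid points in proportion to their weight.
    start = rng.choice(len(points), size=k, replace=False, p=probs)
    centroids = points[start].copy()
    for _ in range(n_iter):
        # Squared distance from every grid point to every centroid.
        d2 = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for c in range(k):
            mask = labels == c
            if mask.any():
                # Weighted centre of mass of the cluster.
                centroids[c] = np.average(points[mask], axis=0, weights=weights[mask])
    # Snap each centroid to its nearest grid point; duplicates are merged.
    d2 = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return np.unique(d2.argmin(axis=0))

# Toy 1D "real-space grid" and Gaussian stand-ins for orbitals (assumption).
x = np.linspace(-5.0, 5.0, 200)
centres = np.array([-1.5, -0.5, 0.5, 1.5])
phi = np.exp(-0.5 * (x[:, None] - centres[None, :]) ** 2)          # "occupied"
psi = np.exp(-0.5 * ((x[:, None] - centres[None, :]) / 1.5) ** 2)  # "unoccupied"

# Orbital pair products Z[r, (i, a)] = phi_i(r) * psi_a(r).
Z = (phi[:, :, None] * psi[:, None, :]).reshape(len(x), -1)

# Weight each grid point by the total pair density there (assumed choice).
weights = (phi ** 2).sum(axis=1) * (psi ** 2).sum(axis=1)
idx = weighted_kmeans(x[:, None], weights, k=10)

# Least-squares interpolation vectors theta such that Z ~= theta @ Z[idx].
C = Z[idx]
theta = np.linalg.lstsq(C.T, Z.T, rcond=None)[0].T

rel_err = np.linalg.norm(Z - theta @ C) / np.linalg.norm(Z)
print(f"{len(idx)} interpolation points, relative error {rel_err:.2e}")
```

In this sketch the full pair-product matrix `Z` is replaced by `theta` and the small matrix `Z[idx]`. Under the assumptions above, a factorization of this kind is the sort of step that would let storage and arithmetic scale with the number of interpolation points rather than with the number of orbital pairs.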