Evidential Deep Learning for Interatomic Potentials

Han Xu, Taoyong Cui, Chenyu Tang, Dongzhan Zhou, Yuqiang Li, Xiang Gao, Xingao Gong, Wanli Ouyang, Shufei Zhang, Mao Su
2024-07-19
Abstract: Machine learning interatomic potentials (MLIPs) have been widely used to facilitate large-scale molecular simulations with ab initio level accuracy. However, MLIP-based molecular simulations frequently encounter the issue of collapse due to decreased prediction accuracy for out-of-distribution (OOD) data. To mitigate this issue, it is crucial to enrich the training set with active learning, where uncertainty estimation serves as an effective method for identifying and collecting OOD data. Therefore, a feasible method for uncertainty estimation in MLIPs is desired. The existing methods either require expensive computations or compromise prediction accuracy. In this work, we introduce evidential deep learning for interatomic potentials (eIP) with a physics-inspired design. Our experiments demonstrate that eIP consistently generates reliable uncertainties without incurring notable additional computational costs, while the prediction accuracy remains unchanged. Furthermore, we present an eIP-based active learning workflow, where eIP is used not only to estimate the uncertainty of molecular data but also to perform uncertainty-driven dynamics simulations. Our findings show that eIP enables efficient sampling for a more diverse dataset, thereby advancing the feasibility of MLIP-based molecular simulations.
Computational Physics
What problem does this paper attempt to address?
The paper addresses two problems that machine learning interatomic potentials (MLIPs) encounter in molecular simulations: prediction accuracy degrades on out-of-distribution (OOD) data, and effective training datasets must be built through active learning. Specifically, the core problems are:

1. **Reducing the decline in prediction accuracy caused by OOD data**: When an MLIP is applied to unseen data, its prediction accuracy often drops significantly, which can even cause the simulation to fail. An effective method for estimating prediction uncertainty is therefore needed to identify these OOD data points.
2. **Constructing effective training datasets**: Improving an MLIP requires continuously enriching its training dataset through active learning. This in turn requires a reliable uncertainty estimate, so that the most uncertain data points, which are the most likely to carry new information, can be selected for labeling.

To address these issues, the paper proposes "Evidential Deep Learning for Interatomic Potentials" (eIP). Its main contributions are:

- Providing a new framework for estimating uncertainty in interatomic potential models, which generates reliable uncertainty estimates without significant additional computational cost.
- Designing a model architecture that respects physical properties, including locality and equivariance, so that the uncertainty estimates are reasonable and effective.
- Introducing a Bayesian quantile regression model to improve the handling of data that are not Gaussian distributed.
- Implementing an eIP-based active learning workflow, in which eIP is used not only for uncertainty estimation but also for uncertainty-driven dynamics simulations that explore unknown atomic structures.
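The selection step in the active learning problem above, picking the most uncertain structures for labeling, can be sketched as follows. This is a minimal illustration rather than the paper's workflow: the function name `select_for_labeling`, the `budget` and `threshold` parameters, and the example uncertainty values are all assumptions introduced here.

```python
from typing import List, Sequence


def select_for_labeling(
    uncertainties: Sequence[float],
    budget: int,
    threshold: float = 0.0,
) -> List[int]:
    """Return indices of up to `budget` structures with the highest uncertainty.

    Hypothetical helper: only structures whose uncertainty is strictly above
    `threshold` are considered candidates for labeling.
    """
    if budget < 0:
        raise ValueError("budget must be non-negative")
    # Keep only structures the model is sufficiently unsure about.
    candidates = [i for i, u in enumerate(uncertainties) if u > threshold]
    # Sort by descending uncertainty; Python's sort is stable, so ties keep
    # their original order.
    candidates.sort(key=lambda i: uncertainties[i], reverse=True)
    return candidates[:budget]


# Made-up per-structure uncertainties for a pool of five candidates.
pool_uncertainty = [0.02, 0.31, 0.07, 0.55, 0.01]
picked = select_for_labeling(pool_uncertainty, budget=2, threshold=0.05)
print(picked)  # [3, 1]
```

The selected indices would then be sent for reference calculations and added to the training set before the model is retrained. A top-k rule with a threshold is only one possible selection policy; the summary does not state which rule the paper uses.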
Through experiments on small-molecule and water datasets, the paper demonstrates the effectiveness and efficiency of eIP, indicating its potential to improve the performance of MLIPs.
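To make the efficiency claim more concrete, the sketch below shows how an evidential regression model can yield uncertainties from a single forward pass. It assumes the standard deep evidential regression parameterization, in which the network outputs the four parameters of a Normal-Inverse-Gamma distribution. The summary does not confirm that eIP uses exactly this form (it mentions a Bayesian quantile regression variant), so the names `NIGParams` and `evidential_uncertainty` and the numeric values are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class NIGParams:
    """Hypothetical container for Normal-Inverse-Gamma outputs of one prediction."""
    gamma: float  # predicted mean
    nu: float     # virtual observation count for the mean, must be > 0
    alpha: float  # shape parameter, must be > 1 for finite variances
    beta: float   # scale parameter, must be > 0


def evidential_uncertainty(p: NIGParams) -> Tuple[float, float, float]:
    """Return (prediction, aleatoric variance, epistemic variance).

    Uses the standard deep evidential regression formulas:
      aleatoric = E[sigma^2] = beta / (alpha - 1)
      epistemic = Var[mu]    = beta / (nu * (alpha - 1))
    """
    if p.nu <= 0 or p.alpha <= 1 or p.beta <= 0:
        raise ValueError("require nu > 0, alpha > 1, beta > 0")
    aleatoric = p.beta / (p.alpha - 1.0)
    epistemic = p.beta / (p.nu * (p.alpha - 1.0))
    return p.gamma, aleatoric, epistemic


# Made-up parameters for a single predicted quantity.
params = NIGParams(gamma=-1.5, nu=2.0, alpha=3.0, beta=0.5)
prediction, aleatoric, epistemic = evidential_uncertainty(params)
print(prediction, aleatoric, epistemic)  # -1.5 0.25 0.125
```

Because all four parameters come out of one forward pass, no ensemble or repeated sampling is needed. This is the general reason evidential approaches add little computational cost.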