Abstract: Machine learning interatomic potentials (MLIPs) have seen significant advances as efficient replacements for expensive quantum chemical calculations. Uncertainty estimates for MLIPs are crucial for quantifying the additional model error they introduce and for leveraging this information in active learning strategies. MLIPs based on Gaussian process regression (GPR) provide a predictive standard deviation as a possible uncertainty measure. Ensemble-based uncertainties are an alternative. Although these uncertainty measures have been applied to active learning, how they correlate with the error has rarely been studied, and it is not always clear whether active learning actually outperforms random sampling. We consider GPR models with Coulomb and SOAP representations as inputs to predict potential energy surfaces and excitation energies of molecules. We evaluate how the GPR variance and ensemble-based uncertainties relate to the error and whether model performance improves when the most uncertain samples are selected from a fixed configuration space. For the ensemble-based uncertainty estimates, we find that they often provide no information about the error. For the GPR standard deviation, we find that predictions with an increasing standard deviation often also have an increasing systematic bias, which the uncertainty does not capture. In these cases, selecting the training samples with the highest uncertainty leads to a model with a worse test error than random sampling. We conclude that confidence intervals derived from the predictive standard deviation can be highly overconfident. Selecting samples with a high GPR standard deviation leads to a model that overemphasizes the borders of the configuration space represented in the fixed dataset. This may result in worse performance in more densely sampled areas but better generalization for extrapolation tasks.
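
The two checks described in the abstract, whether confidence intervals derived from a predictive standard deviation are overconfident and which pool samples an uncertainty-driven strategy would select, can be illustrated with a minimal sketch. It assumes Gaussian two-sided intervals; the function names `empirical_coverage` and `select_most_uncertain` and the toy numbers are hypothetical and are not taken from the paper.

```python
from statistics import NormalDist


def empirical_coverage(errors, sigmas, level=0.95):
    """Fraction of predictions whose absolute error lies inside the
    two-sided Gaussian confidence interval implied by sigma."""
    if len(errors) != len(sigmas) or not errors:
        raise ValueError("errors and sigmas must be non-empty and equally long")
    # z-score of the two-sided interval (about 1.96 for level=0.95)
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    inside = sum(abs(e) <= z * s for e, s in zip(errors, sigmas))
    return inside / len(errors)


def select_most_uncertain(sigmas, k):
    """Indices of the k pool samples with the largest predicted uncertainty."""
    order = sorted(range(len(sigmas)), key=lambda i: sigmas[i], reverse=True)
    return order[:k]


# Hypothetical toy data: signed prediction errors and predicted standard deviations
errors = [0.1, -0.2, 0.9, 1.5, -2.0]
sigmas = [0.2, 0.2, 0.3, 0.4, 0.5]

coverage = empirical_coverage(errors, sigmas, level=0.95)
picked = select_most_uncertain(sigmas, k=2)
print(f"nominal 0.95, empirical {coverage:.2f}, selected indices {picked}")
```

An empirical coverage well below the nominal level, as in this toy data, is what overconfident intervals look like. In the toy data the samples with the largest predicted uncertainty are also those with the largest error, yet the intervals are still far too narrow.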

Evaluation of uncertainty estimations for Gaussian process regression based machine learning interatomic potentials
