Uncertainty Quantification in Machine Learning for Glass Transition Temperature Prediction of Polymers

Hao Tang, Tianle Yue, Ying Li
DOI: https://doi.org/10.26434/chemrxiv-2024-7lggj
2024-07-12
Abstract: Machine learning (ML) has become an important technique in materials science, markedly accelerating the discovery and design of novel materials while reducing experimental costs. Uncertainty quantification (UQ) plays a pivotal role in the accurate prediction and innovative design of novel materials through ML techniques. In this study, we comprehensively evaluate six UQ methods in ML for predicting the glass transition temperature (T_g) of polymers: ensemble, Gaussian process regression (GPR), Monte Carlo dropout (MCD), mean-variance estimation (MVE), Bayesian neural network (BNN), and evidential deep learning (EDL). We assess the accuracy and performance of these UQ methods using three metrics: Spearman’s rank correlation coefficient, calibration, and sparsification, offering a substantial reference for data-driven polymer design. Our UQ analysis of ML predictions covers test data, out-of-distribution data from experiments and molecular dynamics simulations, and high-T_g polymer data. The results indicate that ML models are robust and effective in predicting polymer T_g values for test and experimental data. However, correlating actual errors with predicted uncertainties (standard deviations) remains a significant challenge, and ML models are frequently overconfident, reporting low uncertainties for predictions with large errors. Moreover, the accuracy of ML predictions improves when data with large uncertainties are excluded, suggesting a potential strategy for refining ML model performance.
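The three metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the synthetic errors, and the single-threshold coverage check (used here as a stand-in for a full calibration curve) are all assumptions made for illustration, and the numbers are randomly generated rather than taken from the paper.

```python
import random
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def sparsification_curve(abs_errors, sigmas, fractions=(0.0, 0.25, 0.5, 0.75)):
    """MAE of the points kept after dropping each fraction with the largest sigma."""
    order = sorted(range(len(sigmas)), key=lambda i: sigmas[i])  # most confident first
    curve = []
    for f in fractions:
        keep = order[: max(1, round(len(order) * (1 - f)))]
        curve.append(mean(abs_errors[i] for i in keep))
    return curve


def coverage(errors, sigmas, z=1.0):
    """Fraction of signed errors within z * sigma (about 0.683 at z=1 if Gaussian and calibrated)."""
    return mean(1.0 if abs(e) <= z * s else 0.0 for e, s in zip(errors, sigmas))


# Synthetic example (assumption): errors drawn so that sigma is correct by construction.
rng = random.Random(0)
sigmas = [rng.uniform(5.0, 50.0) for _ in range(2000)]   # predicted std of T_g, in K
errors = [rng.gauss(0.0, s) for s in sigmas]             # prediction minus true value
abs_errors = [abs(e) for e in errors]

rho = spearman(abs_errors, sigmas)
curve = sparsification_curve(abs_errors, sigmas)
cov_1sigma = coverage(errors, sigmas)
print(f"Spearman rho: {rho:.2f}")
print("MAE after dropping 0/25/50/75% most uncertain:", [round(m, 1) for m in curve])
print(f"1-sigma coverage: {cov_1sigma:.2f}")
```

In this sketch, an overconfident model would show 1-sigma coverage well below 0.683, and a sparsification curve that does not fall as uncertain points are removed would indicate that the uncertainties do not rank the errors usefully.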
Chemistry