A Machine Learning Based Approach to Reaction Rate Estimation

Matthew S. Johnson, William H. Green
DOI: https://doi.org/10.26434/chemrxiv-2022-c98gc-v2
2023-12-18
Abstract: Chemical kinetic models are vital to accurately predicting phenomena in a wide variety of fields, from combustion to atmospheric chemistry to electrochemistry. However, building an accurate chemical kinetic model requires the efficient and accurate estimation of many reaction rate coefficients for many reaction classes with highly variable amounts of available training data. Current techniques for fast automatic rate estimation tend to be poorly optimized and tedious to maintain and extend. We have developed a machine learning algorithm for automatically training subgraph isomorphic decision trees (SIDT) to predict rate coefficients for arbitrary reaction types. This method is fully automatic, scalable to virtually any dataset size, and human-readable; it can incorporate qualitative chemical knowledge from experts and provides detailed uncertainty information for its estimates. The accuracy of the algorithm is tested against the state-of-the-art rate rules scheme in the RMG-database for five selected reaction families. The SIDT method is shown to significantly improve estimation accuracy across all reaction families and considered statistics. The estimator's uncertainty estimates are validated against actual errors.
Chemistry
What problem does this paper attempt to address?
The paper addresses how to estimate chemical reaction rate coefficients quickly and accurately. Building a chemical kinetic model requires rate estimates for many reaction classes, some with abundant training data and some with very little, and existing fast estimation techniques are poorly optimized and tedious to maintain and extend. The paper introduces a machine learning algorithm that automatically trains subgraph isomorphic decision trees (SIDT) to predict rate coefficients for arbitrary reaction types. The approach is fully automatic, scalable to virtually any dataset size, and human-readable; it can incorporate qualitative chemical knowledge from experts and provides detailed uncertainty information for each estimate. Tested against the state-of-the-art rate rules scheme in the RMG-database on five selected reaction families, the SIDT method significantly improves estimation accuracy across all families and considered statistics, and its uncertainty estimates are validated against actual errors. In short, the paper aims to automate accurate rate coefficient estimation so that chemical kinetic models can be built more reliably.
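
To make the idea of a decision tree that estimates rate coefficients and reports an uncertainty more concrete, here is a minimal Python sketch. It is not the authors' implementation. Everything in it is an assumption made for illustration: the tree is built by hand rather than trained automatically, each node matches a set of string labels (a stand-in for a real subgraph isomorphism check on molecular structures), the node names and log10(k) values are invented, and falling back to a parent node when a node has too little data is one plausible design choice rather than something the abstract specifies.

```python
from dataclasses import dataclass, field
from statistics import mean, stdev


@dataclass
class Node:
    name: str
    required: frozenset  # hypothetical stand-in for a subgraph template
    log_k: list = field(default_factory=list)  # training log10(k) values seen here
    children: list = field(default_factory=list)

    def matches(self, features):
        # A reaction "matches" when it has every label the node requires.
        return self.required <= features


def descend(root, features):
    """Return the path from the root to the most specific matching node."""
    path = [root]
    node = root
    while True:
        nxt = next((c for c in node.children if c.matches(features)), None)
        if nxt is None:
            return path
        path.append(nxt)
        node = nxt


def train(root, data):
    # Each training point is recorded at every node on its path.
    for features, log_k in data:
        for node in descend(root, features):
            node.log_k.append(log_k)


def estimate(root, features):
    """Use the deepest matching node with at least two points (assumed rule)."""
    for node in reversed(descend(root, features)):
        if len(node.log_k) >= 2:
            return node.name, mean(node.log_k), stdev(node.log_k)
    raise ValueError("not enough training data")


# Hand-built toy tree with invented labels and values.
primary = Node("C-H/primary", frozenset({"C-H", "primary"}))
c_h = Node("C-H", frozenset({"C-H"}), children=[primary])
o_h = Node("O-H", frozenset({"O-H"}))
root = Node("root", frozenset(), children=[c_h, o_h])

train(root, [
    ({"C-H", "primary"}, 6.0),
    ({"C-H", "primary"}, 6.4),
    ({"C-H", "secondary"}, 7.0),
    ({"O-H"}, 8.0),
])

specific = estimate(root, {"C-H", "primary"})   # answered by the leaf
general = estimate(root, {"C-H", "tertiary"})   # answered by the "C-H" node
fallback = estimate(root, {"O-H"})              # leaf has one point, so root answers
```

In this sketch, a query is answered by the most specific node that has enough data, and the spread of the training values at that node serves as a crude uncertainty. That mirrors, in a very simplified way, how a tree-based estimator can cope with reaction classes that have very different amounts of training data.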