Prediction of Diffusion Coefficient Through Machine Learning Based on Transition State Theory Descriptors

François-Xavier Coudert, Emmanuel Ren
DOI: https://doi.org/10.26434/chemrxiv-2024-h98mf-v2
2024-03-25
Abstract: Nanoporous materials serve as very effective media for storing or separating small molecules. To design the best materials for a given application based on adsorption, one usually assesses the equilibrium performance by using key thermodynamic quantities such as Henry constants or adsorption loading values. To go beyond standard methodologies, we probe here the transport effects occurring in the material by studying the self-diffusion coefficients of xenon inside the nanopores of framework materials. We find good correlations between the diffusion coefficients and the pore aperture size, as well as other geometrical and energetic descriptors. We used extensive molecular dynamics simulations to calculate the diffusion coefficient of xenon in 4,873 MOFs from the CoREMOF 2019 database, the first large-scale database of transport properties published at this scale. Based on this data, we present a tool to quickly evaluate the diffusion energy barrier that proved to be very correlated to the diffusion rate. This descriptor, alongside other geometrical characterizations, was then used to build a machine learning model that can predict the xenon diffusion coefficients in MOFs. The final trained model is quite accurate and shows a root mean square error (RMSE) on the log_{10} of the diffusion coefficient equal to 0.25.
Chemistry
What problem does this paper attempt to address?
The problem this paper addresses is predicting the self-diffusion coefficient of xenon in porous materials with machine-learning methods. Specifically, the researchers combined transition-state-theory descriptors with molecular dynamics simulations and statistical learning to build a model that predicts the diffusion coefficient of xenon in porous materials quickly and accurately. The work aims to avoid the high computational cost of calculating diffusion coefficients with traditional molecular dynamics simulations and to provide a more efficient way to evaluate transport properties in nanoporous materials, thereby accelerating materials design and optimization.

### Analysis of the Core Problems in the Paper

1. **Computational Cost Problem**:
   - **Background**: Traditional molecular dynamics (MD) simulations can calculate the diffusion coefficient accurately, but at a very high computational cost. High-throughput screening of materials in particular requires large amounts of computing resources and time.
   - **Objective**: Develop a machine-learning-based method that significantly reduces computation time and resource requirements while preserving prediction accuracy.
2. **Data Generation and Feature Selection**:
   - **Data Generation**: The researchers used MD simulations to calculate the diffusion coefficients of xenon in 4,873 MOFs (metal-organic frameworks) and established a large-scale database of diffusion coefficients.
   - **Feature Selection**: In addition to the energy barrier (diffusion activation energy), other geometric and energetic descriptors, such as the pore-limiting diameter (PLD) and the adsorption enthalpy, were selected as input features for the machine-learning model.
3. **Model Training and Validation**:
   - **Model Architecture**: A supervised-learning model was built with the XGBoost framework.
   - **Performance Evaluation**: The final trained model has a root-mean-square error (RMSE) of 0.25 on the base-10 logarithm of the diffusion coefficient, \(\log_{10} D_{\text{diff}}\), which corresponds to a typical multiplicative error of about \(10^{0.25} \approx 1.8\) on the diffusion coefficient itself.

### Key Formulas

- **Einstein Equation**:
  \[ \langle r(t)^2 \rangle = 6 D_{\text{diff}} t \]
  where \(\langle r(t)^2 \rangle\) is the mean-squared displacement of xenon atoms at time \(t\), and \(D_{\text{diff}}\) is the diffusion coefficient.
- **Arrhenius Equation**:
  \[ k_{\text{diff}} = A \exp\left(-\frac{E_a}{k_B T}\right) \]
  where \(k_{\text{diff}}\) is the diffusion rate, \(A\) is the pre-exponential factor, \(E_a\) is the activation energy, \(k_B\) is the Boltzmann constant, and \(T\) is the temperature.
- **Relationship between Diffusion Coefficient and Activation Energy**: Because the diffusion coefficient scales with the diffusion rate \(k_{\text{diff}}\), taking the logarithm of the Arrhenius equation gives
  \[ \ln D_{\text{diff}} = \text{const} - \frac{E_a}{k_B T} \]
  where the constant collects the pre-exponential factor \(A\) and geometric factors. At fixed temperature, \(\log D_{\text{diff}}\) therefore decreases linearly as \(E_a\) increases.

### Conclusion

The paper combines molecular dynamics simulations with machine learning to build a model that predicts the diffusion coefficient of xenon in MOFs with an RMSE of 0.25 on \(\log_{10} D_{\text{diff}}\). Once trained, the model avoids running a full MD simulation for each new material, which makes it a practical tool for screening in materials design and optimization.
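
As a rough illustration of the quantities listed under Key Formulas, the following Python sketch estimates a diffusion coefficient from mean-squared-displacement data with the Einstein equation, evaluates the Arrhenius rate for a given barrier, and computes an RMSE on \(\log_{10}\) of the diffusion coefficient. It is a minimal sketch, not the paper's code: the function names, the choice of eV for the barrier, the 300 K temperature, and all numerical values are assumptions made for this example.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def diffusion_from_msd(times, msd):
    """Estimate D from the Einstein equation <r(t)^2> = 6 D t.

    Fits msd = slope * t + intercept by ordinary least squares and
    returns slope / 6. Units follow the inputs (e.g. angstrom^2 / ps).
    """
    n = len(times)
    if n != len(msd) or n < 2:
        raise ValueError("need at least two (time, msd) pairs of equal length")
    t_mean = sum(times) / n
    m_mean = sum(msd) / n
    cov = sum((t - t_mean) * (m - m_mean) for t, m in zip(times, msd))
    var = sum((t - t_mean) ** 2 for t in times)
    return cov / var / 6.0


def arrhenius_rate(prefactor, barrier_ev, temperature_k):
    """Rate k = A * exp(-E_a / (k_B T)) with the barrier E_a given in eV."""
    return prefactor * math.exp(-barrier_ev / (K_B_EV * temperature_k))


def rmse_log10(predicted, reference):
    """RMSE between log10(predicted) and log10(reference) coefficients."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [
        (math.log10(p) - math.log10(r)) ** 2
        for p, r in zip(predicted, reference)
    ]
    return math.sqrt(sum(squared) / len(squared))


# Synthetic, noise-free MSD generated with D = 0.5 (arbitrary units).
times = [float(t) for t in range(1, 11)]
msd = [6.0 * 0.5 * t for t in times]
print("D from MSD fit:", diffusion_from_msd(times, msd))

# A higher barrier gives a lower rate at the same (assumed) temperature.
for barrier in (0.05, 0.10, 0.20):
    print("E_a =", barrier, "eV -> k/A =", arrhenius_rate(1.0, barrier, 300.0))

# Made-up predicted vs. reference coefficients, for illustration only.
print("RMSE(log10 D):", rmse_log10([1.2e-9, 3.0e-10], [1.0e-9, 5.0e-10]))
```

In practice the linear fit would be restricted to the long-time part of the MSD curve, where the motion is diffusive, rather than applied to the whole trajectory.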