Machine Learning Approach to Polymerization Reaction Engineering: Determining Monomers Reactivity Ratios

Tung Nguyen, Mona Bavarian
DOI: https://doi.org/10.48550/arXiv.2301.01231
2023-01-04
Abstract: Here, we demonstrate how machine learning enables the prediction of comonomers reactivity ratios based on the molecular structure of monomers. We combined multi-task learning, multi-inputs, and Graph Attention Network to build a model capable of predicting reactivity ratios based on the monomers chemical structures.
Machine Learning, Soft Condensed Matter, Biomolecules
What problem does this paper attempt to address?
The problem this paper attempts to solve is: **how to predict the reactivity ratios of monomers in copolymers with machine-learning methods**. Specifically, the authors aim to develop a multi-input multi-output model (MIMO GAN) based on the Graph Attention Network (GAN) that predicts reactivity ratios from the molecular structures of the monomers.

### Problem Background

In copolymerization reactions, the reactivity ratios of the monomers strongly affect the properties of the final copolymer, so knowing these ratios is crucial for polymerization reaction engineering. However, traditional experimental methods for determining reactivity ratios are time-consuming and require a large number of repeated experiments and tests. Atomic-level simulation methods such as Density Functional Theory (DFT) can also be used to predict reactivity ratios, but their computational cost is very high and they require repeated calculations across different levels of theory and basis sets.

### Solution

To address these problems, the authors propose a machine-learning-based method built on the Graph Attention Network (GAN). The method combines multi-task learning with a multi-input strategy and uses the molecular structure features of the monomers to predict reactivity ratios. The specific steps are:

1. **Data Pre-processing**: Collect and clean a data set of monomer reactivity ratios, convert the monomers into the SMILES chemical representation, and standardize the data.
2. **Model Construction**: Develop a multi-input multi-output Graph Attention Network (MIMO GAN) that can simultaneously process the SMILES representations of two monomers and their corresponding copolymers.
3. **Model Training and Evaluation**: Use the Mean Squared Error (MSE) as the loss function for model training, and evaluate model performance with the Root Mean Squared Error (RMSE) and R² values.
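
The training and evaluation metrics named in the steps above can be illustrated with a minimal sketch in plain Python. It assumes the two reactivity ratios are treated as two regression tasks and that the training loss is an unweighted average of the per-task MSEs. That weighting, the function names (`mse`, `rmse`, `r_squared`, `multitask_mse`), and the toy numbers are illustrative assumptions, not details taken from the paper.

```python
import math


def mse(y_true, y_pred):
    """Mean squared error over paired observations."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: square root of the MSE."""
    return math.sqrt(mse(y_true, y_pred))


def r_squared(y_true, y_pred):
    """Coefficient of determination R^2 = 1 - SS_res / SS_tot.

    Assumes y_true is not constant (otherwise SS_tot is zero).
    """
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def multitask_mse(r1_true, r1_pred, r2_true, r2_pred):
    """Hypothetical multi-task loss: unweighted mean of the two task MSEs."""
    return 0.5 * (mse(r1_true, r1_pred) + mse(r2_true, r2_pred))


# Toy values for illustration only; these are not data from the paper.
r1_true, r1_pred = [1.0, 2.0, 3.0], [1.0, 2.0, 4.0]
r2_true, r2_pred = [0.5, 1.5, 2.5], [0.5, 1.5, 2.5]

print("loss:", multitask_mse(r1_true, r1_pred, r2_true, r2_pred))
print("RMSE (task 1):", rmse(r1_true, r1_pred))
print("R^2 (task 1):", r_squared(r1_true, r1_pred))
```

Reporting RMSE alongside R² is a common pairing: RMSE is in the same units as the target, while R² is unitless and compares the model against a predict-the-mean baseline.
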
### Results

The research shows that the proposed MIMO GAN model can capture the chemical knowledge involved in copolymerization reactions and accurately predict the reactivity ratios of monomers. Despite some limitations (such as high computational and memory requirements when dealing with large graphs), the model shows good predictive ability and interpretability, making it a useful tool for designing new copolymers.

### Formula Presentation

In the model, the key formulas of the graph attention mechanism are as follows:

- Alignment operation:
  \[ e_{vu}=\text{leaky\_relu}(W\cdot[h_v, h_u]) \]
- Weight distribution:
  \[ a_{vu}=\text{softmax}(e_{vu})=\frac{\exp(e_{vu})}{\sum_{k\in N(v)}\exp(e_{vk})} \]
- Context vector generation:
  \[ C_v = \text{elu}\left(\sum_{u\in N(v)}a_{vu}\cdot W\cdot h_u\right) \]

where \(v\) is the target node, \(N(v)\) represents all neighbors of node \(v\), \(h_v\) and \(h_u\) are the state vectors of node \(v\) and neighbor node \(u\) respectively, \([h_v, h_u]\) denotes their concatenation, and \(W\) is a learnable weight matrix. Through these formulas, the model can learn chemical features from both local and non-local environments, thereby improving prediction accuracy.
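
To make the three attention formulas concrete, here is a minimal sketch of a single attention step for one target node, written in plain Python. It is an illustration under stated assumptions rather than the authors' implementation: the formulas use the same symbol \(W\) in both the alignment and the context step, and the sketch assumes these are two separate learnable parameters, here given the hypothetical names `W_align` (a single row that maps the concatenation \([h_v, h_u]\) to a scalar score) and `W_msg` (a matrix applied to each neighbor state). The leaky-ReLU slope and all numeric values are also assumptions.

```python
import math


def leaky_relu(x, slope=0.01):
    """Leaky ReLU; the slope value is an assumption."""
    return x if x > 0 else slope * x


def elu(x, alpha=1.0):
    """Exponential linear unit."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def softmax(scores):
    """Numerically stable softmax over a list of scalars."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(W, x):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def attention_context(h_v, neighbors, W_align, W_msg):
    """Return (attention weights a_vu, context vector C_v) for node v."""
    # Alignment: e_vu = leaky_relu(W_align . [h_v, h_u]) for each neighbor u.
    scores = [
        leaky_relu(sum(w * x for w, x in zip(W_align, h_v + h_u)))
        for h_u in neighbors
    ]
    # Weight distribution: softmax of the scores over the neighbors of v.
    weights = softmax(scores)
    # Context: C_v = elu(sum_u a_vu * W_msg . h_u), applied element-wise.
    messages = [matvec(W_msg, h_u) for h_u in neighbors]
    summed = [
        sum(a * msg[i] for a, msg in zip(weights, messages))
        for i in range(len(W_msg))
    ]
    return weights, [elu(s) for s in summed]


# Toy example: one target node with two neighbors, 2-dimensional states.
h_v = [1.0, 0.0]
neighbors = [[1.0, 0.0], [0.0, 1.0]]
W_align = [0.5, 0.0, 1.0, -1.0]    # hypothetical values
W_msg = [[1.0, 0.0], [0.0, 1.0]]   # identity, for readability

weights, context = attention_context(h_v, neighbors, W_align, W_msg)
print("attention weights:", weights)
print("context vector:", context)
```

Because the softmax normalizes over \(N(v)\) only, the attention weights for each node sum to one regardless of how many neighbors it has.
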