Inorganic synthesis-structure maps in zeolites with machine learning and crystallographic distances

Daniel Schwalbe-Koda, Daniel E. Widdowson, Tuan Anh Pham, Vitaliy A. Kurlin
DOI: https://doi.org/10.1039/D3DD00134B
2023-07-20
Abstract: Zeolites are inorganic materials known for their diversity of applications, synthesis conditions, and resulting polymorphs. Although their synthesis is controlled both by inorganic and organic synthesis conditions, computational studies of zeolite synthesis have focused mostly on organic template design. In this work, we use a strong distance metric between crystal structures and machine learning (ML) to create inorganic synthesis maps in zeolites. Starting with 253 known zeolites, we show how the continuous distances between frameworks reproduce inorganic synthesis conditions from the literature without using labels such as building units. An unsupervised learning analysis shows that neighboring zeolites according to our metric often share similar inorganic synthesis conditions, even in template-based routes. In combination with ML classifiers, we find synthesis-structure relationships for 14 common inorganic conditions in zeolites, namely Al, B, Be, Ca, Co, F, Ga, Ge, K, Mg, Na, P, Si, and Zn. By explaining the model predictions, we demonstrate how (dis)similarities towards known structures can be used as features for the synthesis space. Finally, we show how these methods can be used to predict inorganic synthesis conditions for unrealized frameworks in hypothetical databases and interpret the outcomes by extracting local structural patterns from zeolites. In combination with template design, this work can accelerate the exploration of the space of synthesis conditions for zeolites.
Materials Science, Machine Learning, Chemical Physics
What problem does this paper attempt to address?
The problem this paper addresses is: **how to relate the inorganic synthesis conditions of zeolites to their structures using machine learning and distances between crystal structures, and how to predict synthesis conditions for hypothetical frameworks that have not yet been realized**.

### Detailed Explanation:

1. **Background Problems**:
   - Zeolites are porous inorganic materials that attract attention for their rich polymorphism and wide range of applications. Their synthesis conditions are complex and high-dimensional, and the choice of inorganic conditions is particularly difficult to model.
   - Organic template design has made significant progress in controlling zeolite synthesis, but the design of inorganic synthesis conditions has not seen comparable progress.
2. **Research Objectives**:
   - **Establish Synthesis-Structure Relationships**: Relate the inorganic synthesis conditions of zeolites to their structures using a strong distance metric between crystal structures, based on the Average Minimum Distance (AMD).
   - **Explore Inorganic Synthesis Space**: Use machine learning to explore the inorganic synthesis space of zeolites efficiently.
   - **Bypass the Lack of Labeled Data**: Propose a data-driven method that predicts inorganic synthesis conditions for a hypothetical zeolite database without relying on hand-assigned structural labels such as building units.
3. **Specific Methods**:
   - **Distance Metric**: Compare crystal structures using the Average Minimum Distance (AMD), which is obtained by averaging the rows of the Pointwise Distance Distribution (PDD). AMD is computationally efficient, changes continuously under small perturbations of atomic positions, and serves as a strong discriminator between crystal structures.
   - **Unsupervised Learning**: Computing the AMD distance matrix between the 253 known zeolites and applying hierarchical clustering shows that structurally similar zeolites often share similar inorganic synthesis conditions.
   - **Supervised Learning**: Train classifiers (such as XGBoost) on synthesis conditions reported in the literature to predict the inorganic synthesis conditions of a given zeolite. Explaining the model predictions then shows which (dis)similarities towards known structures drive each predicted condition.
4. **Application Prospects**:
   - **Predict the Synthesis Conditions of Hypothetical Frameworks**: Apply these methods to hypothetical zeolite databases to predict likely inorganic synthesis conditions for unrealized frameworks, accelerating the search for new zeolites.
   - **Guide Synthesis Experiments**: Infer candidate synthesis conditions from structural similarity, giving experiments a starting point and reducing the cost of trial and error.

### Formula Representation:

- **Average Minimum Distance (AMD)**:
  \[ \mathrm{AMD}_k(S)=\frac{1}{m}\sum_{i = 1}^{m} d_{ik} \]
  where \(S\) is the periodic set of atomic positions of a zeolite, \(m\) is the number of points in its unit cell, and \(d_{ik}\) is the distance from the \(i\)-th point to its \(k\)-th nearest neighbor in \(S\). Two structures \(A\) and \(B\) are compared through the vectors of their first \(k\) AMD values, for example with the Chebyshev distance
  \[ d(A, B)=\max_{1\le j\le k}\left|\mathrm{AMD}_j(A)-\mathrm{AMD}_j(B)\right| \]
- **Shapley Value (SHAP)**:
  \[ \mathrm{SHAP}(x_i)=\sum_{S\subseteq F\setminus\{i\}}\frac{|S|!\,(|F| - |S| - 1)!}{|F|!}\left[f(S\cup\{i\})-f(S)\right] \]
  where \(x_i\) is the \(i\)-th feature, \(F\) is the feature set, and \(f(S)\) is the model prediction when only the features in \(S\) are used.

Through these methods, the paper establishes a relationship between zeolite structure and inorganic synthesis conditions and demonstrates its potential for predicting the synthesis conditions of hypothetical zeolites.
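
To make the distance metric concrete, here is a minimal sketch of an AMD-style comparison, assuming the usual definition in which the \(k\)-th AMD value is the average distance from each point in the unit cell to its \(k\)-th nearest neighbor in the periodic structure, and two structures are compared with the Chebyshev distance between their AMD vectors. The function names `amd` and `amd_distance`, the restriction to orthorhombic cells, and the toy simple-cubic and body-centred-cubic inputs are illustrative assumptions; this is not the paper's implementation and it does not use real zeolite data.

```python
import itertools
import math


def amd(motif, cell, k, reps=3):
    """Return [AMD_1, ..., AMD_k] for a periodic point set (toy version).

    motif: list of (x, y, z) Cartesian points inside one unit cell
    cell:  (a, b, c) edge lengths of an orthorhombic cell (assumption)
    reps:  periodic images generated in each direction; must be large
           enough to contain the k nearest neighbours of every point
    """
    shifts = [
        (i * cell[0], j * cell[1], l * cell[2])
        for i, j, l in itertools.product(range(-reps, reps + 1), repeat=3)
    ]
    rows = []
    for p in motif:
        dists = []
        for q in motif:
            for s in shifts:
                image = (q[0] + s[0], q[1] + s[1], q[2] + s[2])
                d = math.dist(p, image)
                if d > 1e-9:  # skip the point itself
                    dists.append(d)
        dists.sort()
        rows.append(dists[:k])  # distances to the k nearest neighbours
    m = len(motif)
    # Average each neighbour rank over the m points of the motif.
    return [sum(row[j] for row in rows) / m for j in range(k)]


def amd_distance(u, v):
    """Chebyshev (L-infinity) distance between two AMD vectors."""
    return max(abs(a - b) for a, b in zip(u, v))


# Toy inputs: simple cubic vs body-centred cubic, both with unit cell edge 1.
sc = amd([(0.0, 0.0, 0.0)], (1.0, 1.0, 1.0), k=8)
bcc = amd([(0.0, 0.0, 0.0), (0.5, 0.5, 0.5)], (1.0, 1.0, 1.0), k=8)

print(round(amd_distance(sc, bcc), 4))  # → 0.5482
```

In the simple-cubic lattice the six nearest neighbors sit at distance 1 and the next ones at \(\sqrt{2}\), while every point of the body-centred lattice has eight neighbors at \(\sqrt{3}/2\), so the largest gap between the two vectors is \(\sqrt{2}-\sqrt{3}/2\). A pairwise matrix of such distances over many structures is the kind of input that hierarchical clustering would operate on.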