A Machine Learning Approach to Zeolite Synthesis Enabled by Automatic Literature Data Extraction

Zach Jensen, Edward Kim, Soonhyoung Kwon, Terry Z. H. Gani, Yuriy Román-Leshkov, Manuel Moliner, Avelino Corma, Elsa Olivetti
DOI: https://doi.org/10.1021/acscentsci.9b00193
IF: 18.2
2019-04-19
ACS Central Science
Abstract: Zeolites are porous aluminosilicate materials with many industrial and "green" applications. Despite their industrial relevance, many aspects of zeolite synthesis remain poorly understood, requiring costly trial-and-error synthesis. In this paper, we create natural language processing techniques and text markup parsing tools to automatically extract synthesis information and trends from zeolite journal articles. We further engineer a data set of germanium-containing zeolites to test the accuracy of the extracted data and to discover potential opportunities for zeolites containing germanium. We also create a regression model that predicts a zeolite's framework density from the synthesis conditions. This model has a cross-validated root mean squared error of 0.98 T/1000 Å³, and many of the model decision boundaries correspond to known synthesis heuristics in germanium-containing zeolites. We propose that this automatic data extraction can be applied to many different problems in zeolite synthesis and enable novel zeolite morphologies.
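For readers unfamiliar with the metric quoted in the abstract, the following is a minimal sketch of how a cross-validated root mean squared error can be computed. It is not the authors' pipeline: the abstract does not say which regression model or fold scheme was used, so a one-feature least-squares line and an interleaved k-fold split stand in here as assumptions. The function names (`rmse`, `fit_line`, `cross_validated_rmse`) and the toy numbers are hypothetical and do not come from the paper.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept (stand-in model)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return slope, mean_y - slope * mean_x


def cross_validated_rmse(xs, ys, k=5):
    """Predict every sample from a model trained without its fold, then score once."""
    n = len(xs)
    preds = [0.0] * n
    for fold in range(k):
        # Interleaved split: sample i belongs to fold i % k (an assumed scheme).
        train = [i for i in range(n) if i % k != fold]
        test = [i for i in range(n) if i % k == fold]
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        for i in test:
            preds[i] = slope * xs[i] + intercept
    return rmse(ys, preds)


if __name__ == "__main__":
    # Made-up synthesis feature and target values, for illustration only.
    feature = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
    target = [16.9, 16.4, 16.6, 15.8, 15.9, 15.1, 15.3, 14.6, 14.8, 14.0]
    print(f"cross-validated RMSE: {cross_validated_rmse(feature, target):.3f}")
```

The error is in the units of the target, so a reported value of 0.98 T/1000 Å³ is directly comparable to the framework densities themselves.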
chemistry, multidisciplinary