Similarity of Precursors in Solid-State Synthesis as Text-Mined from Scientific Literature

Tanjin He, Wenhao Sun, Haoyan Huo, Olga Kononova, Ziqin Rong, Vahe Tshitoyan, Tiago Botari, Gerbrand Ceder
DOI: https://doi.org/10.1021/acs.chemmater.0c02553
IF: 10.508
2020-08-19
Chemistry of Materials
Abstract: Collecting and analyzing the vast amount of information available in the solid-state chemistry literature may accelerate our understanding of materials synthesis. However, one major problem is the difficulty of identifying which materials from a synthesis paragraph are precursors or are target materials. In this study, we developed a two-step chemical named entity recognition model to identify precursors and targets, based on information from the context around material entities. Using the extracted data, we conducted a meta-analysis to study the similarities and differences between precursors in the context of solid-state synthesis. To quantify precursor similarity, we built a substitution model to calculate the viability of substituting one precursor with another while retaining the target. From a hierarchical clustering of the precursors, we demonstrate that the "chemical similarity" of precursors can be extracted from text data. Quantifying the similarity of precursors helps provide a foundation for suggesting candidate reactants in a predictive synthesis model.
Supporting Information: available free of charge at https://pubs.acs.org/doi/10.1021/acs.chemmater.0c02553. Discussion on benefits of the two-step model, comparison with BERT, and utilization of synthesis time (PDF).
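The idea of scoring precursor substitutions and then clustering precursors can be illustrated with a minimal sketch. This is not the authors' substitution model: it assumes a simple count of same-target recipes that differ by exactly one precursor as a stand-in for substitution viability, and it uses a plain single-linkage clustering with an arbitrary cutoff. The recipe records, function names, and cutoff value below are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy records (target, precursor set); NOT data from the paper.
RECIPES = [
    ("LiCoO2", {"Li2CO3", "Co3O4"}),
    ("LiCoO2", {"LiOH", "Co3O4"}),
    ("LiMn2O4", {"Li2CO3", "MnO2"}),
    ("LiMn2O4", {"LiOH", "MnO2"}),
    ("LiMn2O4", {"Li2CO3", "MnCO3"}),
    ("BaTiO3", {"BaCO3", "TiO2"}),
]


def substitution_counts(recipes):
    """Count precursor pairs that swap for each other while the target stays the same."""
    counts = Counter()
    for (t1, p1), (t2, p2) in combinations(recipes, 2):
        only1, only2 = p1 - p2, p2 - p1
        # Same target, and each recipe has exactly one precursor the other lacks.
        if t1 == t2 and len(only1) == 1 and len(only2) == 1:
            counts[frozenset(only1 | only2)] += 1
    return counts


def distance(a, b, counts):
    """Assumed distance: more observed substitutions means a smaller distance."""
    return 1.0 / (1.0 + counts[frozenset((a, b))])


def single_linkage(items, counts, cutoff):
    """Agglomerative single-linkage clustering, stopped once the closest pair exceeds cutoff."""
    clusters = [{x} for x in sorted(items)]
    while len(clusters) > 1:
        candidates = (
            ((i, j), min(distance(a, b, counts)
                         for a in clusters[i] for b in clusters[j]))
            for i, j in combinations(range(len(clusters)), 2)
        )
        (i, j), d = min(candidates, key=lambda item: item[1])
        if d > cutoff:
            break
        clusters[i] |= clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters


if __name__ == "__main__":
    counts = substitution_counts(RECIPES)
    precursors = set().union(*(p for _, p in RECIPES))
    for cluster in single_linkage(precursors, counts, cutoff=0.6):
        print(sorted(cluster))
```

In this toy setting, precursors that replace one another in otherwise identical recipes end up in the same cluster, which loosely mirrors the notion of "chemical similarity" described in the abstract. A real analysis would need the text-mined dataset and a properly defined substitution probability.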
materials science, multidisciplinary; chemistry, physical
What problem does this paper attempt to address?