Enhancing chemical synthesis research with NLP: Word embeddings for chemical reagent identification - A case study on nano-FeCu

Dingding Cao, Mieow Kee Chan
DOI: https://doi.org/10.1016/j.isci.2024.110780
IF: 5.8
2024-08-29
iScience
Abstract: Nanoparticle synthesis is complex, influenced by multiple variables including reagent selection. This study introduces a specialized corpus focused on "Fe, Cu, synthesis" to train a domain-specific word embedding model using natural language processing (NLP) in an unsupervised environment. Evaluation metrics included average cosine similarity, visual analysis via t-distributed stochastic neighbor embedding (t-SNE), synonym analysis, and analogy reasoning analysis. Results indicate a strong correlation between learning rate and cosine similarity, with enhanced chemical specificity in the tailored model compared to general models. The framework facilitates rapid identification of potential reagents for nano-FeCu synthesis, enhancing precision in nanomaterial research. This innovative approach offers a data-driven pathway for chemical material synthesis, demonstrating significant interdisciplinary applications.
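The cosine-similarity, synonym, and analogy evaluations named in the abstract can be illustrated with a minimal sketch. This is not the authors' code or data: the three-dimensional vectors below are invented toy values, and the names `EMBEDDINGS`, `most_similar`, and `analogy` are hypothetical helpers chosen for this example. A real model would learn much higher-dimensional vectors from the "Fe, Cu, synthesis" corpus.

```python
import math

# Toy 3-dimensional embeddings, invented for illustration only (assumption);
# they are not taken from the paper's trained model.
EMBEDDINGS = {
    "FeCl3": [0.9, 0.1, 0.2],
    "FeSO4": [0.8, 0.2, 0.1],
    "CuCl2": [0.1, 0.9, 0.2],
    "CuSO4": [0.0, 1.0, 0.1],
    "NaBH4": [0.2, 0.2, 0.9],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(query_vec, exclude=(), top_n=3):
    """Rank vocabulary tokens by cosine similarity to query_vec."""
    scored = [
        (token, cosine_similarity(query_vec, vec))
        for token, vec in EMBEDDINGS.items()
        if token not in exclude
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]


def analogy(a, b, c, top_n=1):
    """Answer 'a is to b as c is to ?' with the vector offset b - a + c."""
    target = [
        vb - va + vc
        for va, vb, vc in zip(EMBEDDINGS[a], EMBEDDINGS[b], EMBEDDINGS[c])
    ]
    return most_similar(target, exclude={a, b, c}, top_n=top_n)


# Synonym-style query: which tokens sit closest to FeCl3?
neighbors = most_similar(EMBEDDINGS["FeCl3"], exclude={"FeCl3"})

# Analogy query: FeCl3 is to CuCl2 as FeSO4 is to ?
answer = analogy("FeCl3", "CuCl2", "FeSO4")
```

With these toy values the nearest neighbor of FeCl3 is the other iron salt, and the analogy query returns the copper sulfate token. The same ranking logic would apply to learned vectors, where candidate reagents appear as the top-ranked neighbors of a query term.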