Automation and Machine Learning Augmented by Large Language Models in Catalysis Study

Yuming Su, Xue Wang, Yuanxiang Ye, Yibo Xie, Yujing Xu, Yibing Jiang, Cheng Wang
DOI: https://doi.org/10.1039/d3sc07012c
IF: 8.4
2024-06-27
Chemical Science
Abstract: Recent advancements in artificial intelligence and automation are transforming catalyst discovery and design from traditional trial-and-error manual mode to intelligent, high-throughput digital methodologies. This transformation is driven by four key components, including high-throughput information extraction, automated robotic experimentation, real-time feedback for iterative optimization, and interpretable machine learning for generating new knowledge. These innovations have given rise to the development of self-driving labs and significantly accelerated materials research. Over the past two years, the emergence of large language models (LLMs) has added a new dimension to this field, providing unprecedented flexibility in information integration, decision-making, and interacting with human researchers. This review explores how LLMs are reshaping catalyst design, heralding a revolutionary change in the fields.
chemistry, multidisciplinary
What problem does this paper attempt to address?
This paper reviews how large language models (LLMs) are changing catalyst design and discovery. Traditional catalyst research relies on trial and error and manual operation, but advances in artificial intelligence and automation are moving the field toward an intelligent, high-throughput digital approach. Four key components drive this transformation and underpin self-driving laboratories: high-throughput information extraction, automated robotic experimentation, real-time feedback for iterative optimization, and interpretable machine learning. The emergence of LLMs strengthens each of these, improving information integration, decision-making, and interaction with human researchers.
The paper points out that LLMs improve the efficiency and innovation of catalysis research by processing natural language, automating code generation and data analysis, optimizing experimental design algorithms, and facilitating human-machine interaction. They can extract and use information from a variety of unstructured data sources, something that, according to the paper, traditional machine learning techniques cannot achieve. In addition, combining LLMs with automated robotic systems improves decision-making strategies and supports more advanced self-driving laboratories for closed-loop catalyst discovery.
The paper also gives a detailed introduction to techniques for extracting information from both figures and text, namely Optical Chemical Structure Recognition (OCSR) and Natural Language Processing (NLP). OCSR has evolved from early rule-based methods to hybrid methods that incorporate machine learning, significantly improving the accuracy of structure extraction from chemical images. In NLP, language models such as SciBERT and the GPT series have greatly improved entity recognition and relation extraction from text.
Overall, the paper aims to show how LLMs are reshaping catalyst design, accelerating materials research, and pointing toward revolutionary change in the field.
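The "real-time feedback for iterative optimization" component mentioned above can be illustrated with a toy closed loop: propose conditions, run an experiment, feed the result back, and repeat. The sketch below is not taken from the paper. The function names, the temperature variable, and the quadratic yield model are all hypothetical, and the proposal rule is a plain hill-climbing step standing in for whatever optimizer or LLM-guided planner a real self-driving lab would use.

```python
from typing import Callable, List, Tuple


def simulated_yield(temperature_c: float) -> float:
    """Hypothetical stand-in for a robotic experiment.

    Assumption for illustration only: yield peaks at 350 C and falls off
    quadratically on either side.
    """
    return 100.0 - (temperature_c - 350.0) ** 2 / 100.0


def closed_loop_optimize(
    run_experiment: Callable[[float], float],
    start: float,
    step: float,
    rounds: int,
) -> Tuple[float, float, List[Tuple[float, float]]]:
    """Simple hill climbing with feedback after every round.

    Each round proposes two neighbouring conditions, "runs" them, and
    keeps the better one if it beats the current best. If neither
    improves, the step size is halved to refine the search.
    """
    best_x = start
    best_y = run_experiment(best_x)
    history = [(best_x, best_y)]
    for _ in range(rounds):
        candidates = [best_x - step, best_x + step]
        results = [(x, run_experiment(x)) for x in candidates]
        history.extend(results)
        x, y = max(results, key=lambda r: r[1])
        if y > best_y:
            best_x, best_y = x, y  # feedback: move to the better condition
        else:
            step /= 2.0  # feedback: no gain, so search more finely
    return best_x, best_y, history


best_t, best_y, history = closed_loop_optimize(
    simulated_yield, start=300.0, step=40.0, rounds=10
)
print(f"best temperature: {best_t} C, yield: {best_y}, experiments: {len(history)}")
```

In a real workflow the `run_experiment` callable would wrap an automated platform rather than a formula, and the two-candidate proposal rule would typically be replaced by a more sample-efficient strategy.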