A Prompt-Engineered Large Language Model, Deep Learning Workflow for Materials Classification

Siyu Liu, Tongqi Wen, A. S. L. Subrahmanyam Pattamatta, David J. Srolovitz
2024-03-27
Abstract: Large language models (LLMs) have demonstrated rapid progress across a wide array of domains. Owing to their very large numbers of parameters and volumes of training data, LLMs inherently encompass an expansive and comprehensive materials knowledge database, far exceeding the capabilities of any individual researcher. Nonetheless, devising methods to harness the knowledge embedded within LLMs for the design and discovery of novel materials remains a formidable challenge. We introduce a general approach for addressing materials classification problems, which incorporates LLMs, prompt engineering, and deep learning. Using a dataset of metallic glasses as a case study, our methodology achieved an improvement of up to 463% in prediction accuracy compared to conventional classification models. These findings underscore the potential of leveraging textual knowledge generated by LLMs for materials research, especially in the common situation where datasets are sparse, thereby promoting innovation in materials discovery and design.
Materials Science
What problem does this paper attempt to address?
This paper addresses the problem of using large language models (LLMs) for materials classification. LLMs have very large numbers of parameters and are trained on very large corpora, so they contain abundant materials knowledge, but how to use this knowledge effectively for designing and discovering new materials remains a challenge. The researchers propose a general approach that combines LLMs, prompt engineering, and deep learning for materials classification. Taking metallic glasses as a case study, the approach improves prediction accuracy by up to 463% compared to conventional classification models. The paper argues that the approach is especially useful because datasets in materials science are often sparse, and that it can therefore support the discovery and design of new materials. The study lays out a four-step workflow: defining the materials classification problem, customizing prompts to extract knowledge from LLMs, training a BERT model, and applying the trained model to explore new materials or to study composition-structure-property relationships. This approach demonstrates the potential of natural language processing and prompt engineering in materials science.
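The first two steps of the workflow (defining the classification problem and prompting an LLM for textual knowledge) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the label names, the prompt wording, the `Example` type, and the `stub_llm` placeholder are all assumptions introduced here, and a real pipeline would replace `stub_llm` with a call to an actual LLM.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Step 1: define the classification problem.
# Hypothetical label set, chosen only for illustration.
LABELS = ["glass-forming", "crystalline"]

# Step 2: a prompt template used to elicit textual knowledge from an LLM.
# The wording is an assumption, not the prompt used in the paper.
PROMPT_TEMPLATE = (
    "You are a materials scientist. Describe the alloy {composition}: "
    "its constituent elements, their typical properties, and any factors "
    "relevant to {task}."
)


@dataclass
class Example:
    composition: str  # e.g. "Cu50Zr50"
    label: str        # must be one of LABELS


def build_prompt(composition: str, task: str) -> str:
    """Fill the template for one material."""
    return PROMPT_TEMPLATE.format(composition=composition, task=task)


def stub_llm(prompt: str) -> str:
    """Placeholder standing in for a real LLM call (assumption)."""
    return f"[LLM description for prompt: {prompt}]"


def make_text_dataset(
    examples: List[Example],
    task: str,
    query_llm: Callable[[str], str] = stub_llm,
) -> List[Tuple[str, int]]:
    """Turn labeled compositions into (generated text, label id) pairs."""
    dataset = []
    for ex in examples:
        text = query_llm(build_prompt(ex.composition, task))
        dataset.append((text, LABELS.index(ex.label)))
    return dataset


if __name__ == "__main__":
    # Illustrative inputs only; not data from the paper.
    examples = [
        Example("Cu50Zr50", "glass-forming"),
        Example("Fe", "crystalline"),
    ]
    for text, label_id in make_text_dataset(examples, "glass formation"):
        print(label_id, text[:60])
```

The resulting (text, label id) pairs are the kind of input a text classifier such as a fine-tuned BERT model would consume in the third step; that training step is omitted here to keep the sketch dependency-free.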