Materials science in the era of large language models: a perspective

Ge Lei, Ronan Docherty, Samuel J. Cooper

2024-03-12

Abstract: Large Language Models (LLMs) have garnered considerable interest due to their impressive natural language capabilities, which in conjunction with various emergent properties make them versatile tools in workflows ranging from complex code generation to heuristic finding for combinatorial problems. In this paper we offer a perspective on their applicability to materials science research, arguing that their ability to handle ambiguous requirements across a range of tasks and disciplines means they could be a powerful tool to aid researchers. We qualitatively examine basic LLM theory, connecting it to relevant properties and techniques in the literature before providing two case studies that demonstrate their use in task automation and knowledge extraction at scale. At their current stage of development, we argue LLMs should be viewed less as oracles of novel insight, and more as tireless workers that can accelerate and unify exploration across domains. It is our hope that this paper can familiarise materials science researchers with the concepts needed to leverage these tools in their own research.

Materials Science, Computation and Language

What problem does this paper attempt to address?

This paper explores the potential application of large language models (LLMs) in materials science research. The authors argue that LLMs can be powerful tools for researchers because they handle ambiguous requests and are useful across many tasks and disciplines. Two case studies illustrate this: automated task execution in 3D microstructure analysis, and large-scale knowledge extraction in the form of collecting micrograph labels from papers. The authors do not currently regard LLMs as a source of novel insight, but as a means of accelerating and unifying exploration across domains. The paper first introduces the fundamentals of LLMs, including attention mechanisms, the Transformer architecture, and the concepts of pre-training and language modeling. It then discusses the capabilities of LLMs that are relevant to research, covering intrinsic and extrinsic properties such as optimization response, chain-of-thought reasoning, self-reflection, multimodal processing, programming skills, and existing knowledge of the materials science domain. The paper also mentions applications of LLMs in error correction, programming tasks, and multimodal data augmentation. Lastly, it proposes ways for LLMs to work within materials science workflows, such as retrieval-augmented generation, tool use and tool making, and task integration. These approaches can reduce biases, improve interpretability, and give the model access to up-to-date information. The authors believe that combining LLMs with traditional workflows could transform automated laboratories or pilot production lines in materials science. However, the paper also highlights challenges in using LLMs, such as errors, costs, and limited depth of understanding.
