Machine learning in materials research: developments over the last decade and challenges for the future

Anubhav Jain

DOI: https://doi.org/10.26434/chemrxiv-2024-x6spt

2024-02-26

Abstract: The number of studies that apply machine learning (ML) to materials science has been growing at a rate of approximately 1.67 times per year over the past decade. In this review, I examine this growth in various contexts. First, I present an analysis of the tools (software, databases, materials science methods, and ML methods) most commonly used within papers that apply ML to materials science. The analysis demonstrates that despite the growth of deep learning techniques, the use of classical machine learning is still dominant as a whole. It also demonstrates how new research can effectively build upon past research, particularly in the domain of ML models trained on density functional theory calculation data. Next, I present the progression of best scores as a function of time on the matbench materials science benchmark for formation enthalpy prediction. In particular, a dramatic 7-fold reduction in error is obtained when progressing from feature-based methods that use conventional ML (random forest, support vector regression, etc.) to graph neural network techniques. Finally, I provide views on future challenges and opportunities, focusing on data size and complexity, extrapolation, interpretation, access, and relevance.

Chemistry

What problem does this paper attempt to address?

This paper reviews the development of machine learning (ML) applications in materials science over the past decade and discusses the challenges ahead. The number of studies applying ML to materials science has grown by a factor of approximately 1.67 per year over that period. The author analyzes the most commonly used tools (software, databases, materials science methods, and ML methods) and finds that, although deep learning techniques have developed rapidly, classical machine learning still dominates overall. The review also tracks progress on the matbench materials science benchmark for formation enthalpy prediction, where moving from feature-based methods to graph neural network techniques reduced error by approximately a factor of 7.

The paper is divided into three parts. The first part analyzes cross-referencing across different fields, demonstrating how ML research builds on previous work. The second part quantitatively analyzes progress in structure-property prediction performance, showcasing the rapid development of ML in materials science. The third part discusses future challenges, including data size and complexity, extrapolation, interpretability, accessibility, and relevance.

The author points out that ML research in materials science can quickly build on previous work by reusing existing databases, software libraries, and materials science methods. The most commonly cited software is scikit-learn, followed by density functional theory calculation datasets. Graph neural networks have delivered significant performance gains on structure-property prediction tasks.

Future challenges include data volume and complexity, which call for better datasets and techniques for handling small data; the evaluation and improvement of extrapolation capabilities; better model interpretability to yield physical insight; and accessibility and relevance, including the accessibility of large language models and the reproducibility of scientific findings. Additionally, as model performance improves, it is important to ensure that these models serve practical scientific goals rather than solely pursuing high benchmark scores.
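
To put the reported growth rate in perspective, the short sketch below compounds a factor of 1.67 per year over ten years and derives the implied doubling time. It is only an arithmetic illustration, not a calculation from the paper: it assumes the rate is constant and that "the past decade" means exactly ten compounding steps, and the function names are hypothetical.

```python
import math


def cumulative_growth(annual_factor: float, years: int) -> float:
    """Total multiplicative growth after compounding annual_factor for `years` years."""
    return annual_factor ** years


def doubling_time(annual_factor: float) -> float:
    """Years needed to double at a constant multiplicative rate per year."""
    return math.log(2) / math.log(annual_factor)


# 1.67x per year is the figure quoted in the abstract.
ANNUAL_FACTOR = 1.67
# Assumption: treat "the past decade" as exactly ten compounding steps.
YEARS = 10

total = cumulative_growth(ANNUAL_FACTOR, YEARS)
t_double = doubling_time(ANNUAL_FACTOR)

print(f"Implied growth over {YEARS} years: about {total:.0f}x")
print(f"Implied doubling time: about {t_double:.2f} years")
```

Under these assumptions, the annual study count would grow roughly 170-fold over the decade and double about every 1.35 years. A rate that varies from year to year would change both figures.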

