Materials Expert-Artificial Intelligence for Materials Discovery

Yanjun Liu, Milena Jovanovic, Krishnanand Mallayya, Wesley J. Maddox, Andrew Gordon Wilson, Sebastian Klemenz, Leslie M. Schoop, Eun-Ah Kim
2023-12-05
Abstract: The advent of material databases provides an unprecedented opportunity to uncover predictive descriptors for emergent material properties from vast data space. However, common reliance on high-throughput ab initio data necessarily inherits limitations of such data: mismatch with experiments. On the other hand, experimental decisions are often guided by an expert's intuition honed from experiences that are rarely articulated. We propose using machine learning to "bottle" such operational intuition into quantifiable descriptors using expertly curated measurement-based data. We introduce "Materials Expert-Artificial Intelligence" (ME-AI) to encapsulate and articulate this human intuition. As a first step towards such a program, we focus on the topological semimetal (TSM) among square-net materials as the property inspired by the expert-identified descriptor based on structural information: the tolerance factor. We start by curating a dataset encompassing 12 primary features of 879 square-net materials, using experimental data whenever possible. We then use Dirichlet-based Gaussian process regression using a specialized kernel to reveal composite descriptors for square-net topological semimetals. The ME-AI learned descriptors independently reproduce expert intuition and expand upon it. Specifically, new descriptors point to hypervalency as a critical chemical feature predicting TSM within square-net compounds. Our success with a carefully defined problem points to the "machine bottling human insight" approach as promising for machine learning-aided material discovery.
Materials Science,Strongly Correlated Electrons,Machine Learning,Data Analysis, Statistics and Probability
What problem does this paper attempt to address?
This paper addresses a specific problem in materials science: how to find descriptors that predict a given material property from large datasets. It focuses on predicting topological semimetals (TSMs) among compounds with a centered square-net structure. Traditionally, identifying such materials requires detailed symmetry analysis, which is complex and time-consuming. The paper therefore proposes "Materials Expert-Artificial Intelligence" (ME-AI), which uses machine learning to extract expert intuition from experimental data and turn it into quantifiable descriptors, so that TSMs can be predicted more efficiently. The key points of the paper are:
1. **Problem Background**: Materials databases offer an unprecedented opportunity to discover predictive descriptors of material properties from large datasets. However, high-throughput first-principles (ab initio) data often do not match experimental results, and experimental decisions are often guided by expert intuition that is rarely articulated.
2. **Research Method**: ME-AI extracts composite descriptors from a carefully curated, measurement-based dataset using Dirichlet-based Gaussian process regression with a specialized kernel.
3. **Specific Application**: As a first step, the paper studies topological semimetals with a centered square-net structure and curates a dataset of 12 primary features for 879 square-net materials, using experimental data whenever possible.
4. **Achievements and Significance**: ME-AI independently reproduces expert intuition and expands on it. In particular, the new descriptors point to hypervalency as a critical chemical feature for predicting TSMs within square-net compounds.

These newly discovered descriptors help clarify the electronic properties of the materials and can guide future materials discovery. In conclusion, the paper extracts effective material descriptors from experimental data by combining expert knowledge with machine learning, offering a new approach to accelerating the discovery of new materials.
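
To make the "Dirichlet-based Gaussian process regression" step more concrete, here is a minimal sketch of how a classification task (TSM or not) can be recast as GP regression. It is not the authors' implementation. It assumes the commonly described Dirichlet-based construction, in which class labels become log-space regression targets with per-point noise, and it substitutes a plain RBF kernel for the paper's specialized kernel. The function names, the `alpha_eps` value, and the toy one-dimensional data are all hypothetical.

```python
import numpy as np


def dirichlet_targets(labels, n_classes, alpha_eps=0.01):
    """Turn integer class labels into log-space regression targets.

    Assumption: each label is read as a Dirichlet concentration vector
    (alpha_eps everywhere, plus 1 for the observed class), and each
    component is approximated by a log-normal distribution.
    """
    alpha = np.full((len(labels), n_classes), alpha_eps)
    alpha[np.arange(len(labels)), labels] += 1.0
    noise_var = np.log(1.0 / alpha + 1.0)        # per-point noise variance
    targets = np.log(alpha) - 0.5 * noise_var    # regression targets
    return targets, noise_var


def rbf_kernel(a, b, lengthscale=1.0, variance=1.0):
    """Plain RBF kernel; a stand-in for the paper's specialized kernel."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return variance * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / lengthscale**2)


def predict_class_probs(x_train, labels, x_test, n_classes=2):
    """Fit one zero-mean GP regressor per class and return class probabilities.

    Simplification: probabilities are a softmax of the posterior means,
    rather than an average over samples from the GP posterior.
    """
    targets, noise_var = dirichlet_targets(labels, n_classes)
    k_train = rbf_kernel(x_train, x_train)
    k_cross = rbf_kernel(x_test, x_train)
    means = np.empty((len(x_test), n_classes))
    for c in range(n_classes):
        # Heteroscedastic GP regression: the noise differs per training point.
        gram = k_train + np.diag(noise_var[:, c])
        means[:, c] = k_cross @ np.linalg.solve(gram, targets[:, c])
    shifted = means - means.max(axis=1, keepdims=True)
    exp_means = np.exp(shifted)
    return exp_means / exp_means.sum(axis=1, keepdims=True)


if __name__ == "__main__":
    # Hypothetical 1D feature: class 0 on the left, class 1 on the right.
    x_train = np.array([[-2.0], [-1.5], [-1.0], [1.0], [1.5], [2.0]])
    labels = np.array([0, 0, 0, 1, 1, 1])
    x_test = np.array([[-1.5], [1.5]])
    print(predict_class_probs(x_train, labels, x_test))
```

In a setting like the one the paper describes, the input rows would be the curated primary features of each compound, and the kernel would be chosen so that the learned hyperparameters can be read as composite descriptors. That part is specific to the paper and is not reproduced here.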