Predicting materials properties without crystal structure: Deep representation learning from stoichiometry

Rhys E. A. Goodall, Alpha A. Lee
DOI: https://doi.org/10.1038/s41467-020-19964-7
2020-09-24
Abstract: Machine learning has the potential to accelerate materials discovery by accurately predicting materials properties at a low computational cost. However, the model inputs remain a key stumbling block. Current methods typically use descriptors constructed from knowledge of either the full crystal structure -- therefore only applicable to materials with already characterised structures -- or structure-agnostic fixed-length representations hand-engineered from the stoichiometry. We develop a machine learning approach that takes only the stoichiometry as input and automatically learns appropriate and systematically improvable descriptors from data. Our key insight is to treat the stoichiometric formula as a dense weighted graph between elements. Compared to the state of the art for structure-agnostic methods, our approach achieves lower errors with less data.
Computational Physics, Materials Science, Machine Learning
What problem does this paper attempt to address?
### Problems the paper attempts to solve

This paper addresses a bottleneck in materials property prediction. Specifically, the authors use machine learning to predict materials properties without relying on crystal-structure information, with the aim of accelerating the discovery of new materials.

#### 1. **Existing challenges**

- **Limitations of high-throughput experiments and calculations**: The materials space is so vast that discovering new materials through exhaustive experiments is infeasible. High-throughput ab initio simulations can calculate materials properties, but they require atomic coordinates as input and are therefore usually applicable only to materials that have already been synthesized and characterized.
- **High computational cost of structure prediction**: For a new compound, predicting its possible crystal structures is a global optimization problem with extremely high computational cost, which limits the reach of high-throughput workflows.
- **Descriptor bottleneck**: Most existing machine-learning models rely on descriptors derived from the crystal structure, which limits their use in exploring novel compounds.

#### 2. **Core problems of the paper**

The paper proposes a machine-learning framework that takes only the stoichiometric formula as input and automatically learns appropriate and systematically improvable descriptors from the data. The key idea is to treat the stoichiometric formula as a dense weighted graph between elements and to use a message-passing neural network (MPNN) to learn materials descriptors directly.

#### 3. **Objectives**

- **Reduce prediction error**: Compared with existing structure-agnostic methods, the approach achieves lower prediction errors with less data.
- **Improve sample efficiency**: Use the training data more efficiently, reducing the amount of data needed.
- **Uncertainty estimation**: Provide reliable uncertainty estimates through the Deep Ensemble method, making the model more trustworthy when applied to unseen materials.
- **Transfer learning**: Pre-train the model on a large dataset (such as OQMD) and then fine-tune it on a small experimental dataset to improve prediction performance.

### Summary

The main objective of the paper is to develop a machine-learning framework that bypasses the need for crystal structures, so that materials properties can be predicted earlier in the discovery process. By treating the stoichiometric formula as a weighted graph and applying a message-passing neural network, the method aims to improve prediction accuracy as well as the generalization ability and reliability of the model.
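
To make the "dense weighted graph between elements" idea concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes simple formulas without parentheses, uses fractional stoichiometric amounts as node weights, and connects every pair of distinct elements. The function names `parse_formula`, `formula_to_graph`, and `message_pass` are hypothetical, and the update step is a plain weighted average standing in for the learned message-passing layers of an MPNN.

```python
import re
from itertools import permutations
from typing import Dict, List, Tuple


def parse_formula(formula: str) -> Dict[str, float]:
    """Parse a simple formula such as 'Fe2O3' into element amounts.

    Assumption: no parentheses or hydrates; a missing count means 1.
    """
    amounts: Dict[str, float] = {}
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d+(?:\.\d+)?)?", formula):
        amounts[symbol] = amounts.get(symbol, 0.0) + (float(count) if count else 1.0)
    return amounts


def formula_to_graph(formula: str) -> Tuple[List[str], List[float], List[Tuple[int, int]]]:
    """Return (elements, fractional weights, directed edges) for a formula.

    Assumption: the graph is dense, i.e. every ordered pair of distinct
    elements is connected, and node weights are fractional amounts.
    """
    amounts = parse_formula(formula)
    total = sum(amounts.values())
    elements = sorted(amounts)
    weights = [amounts[e] / total for e in elements]
    edges = list(permutations(range(len(elements)), 2))
    return elements, weights, edges


def message_pass(
    features: List[List[float]],
    weights: List[float],
    edges: List[Tuple[int, int]],
) -> List[List[float]]:
    """One toy update: each node adds the weight-averaged features of its neighbours.

    This is a hand-written stand-in for a learned message-passing layer.
    """
    updated = []
    for i, h_i in enumerate(features):
        nbrs = [dst for (src, dst) in edges if src == i]
        norm = sum(weights[j] for j in nbrs)
        msg = [
            sum(weights[j] * features[j][k] for j in nbrs) / norm if nbrs else 0.0
            for k in range(len(h_i))
        ]
        updated.append([a + b for a, b in zip(h_i, msg)])
    return updated


if __name__ == "__main__":
    elements, weights, edges = formula_to_graph("Fe2O3")
    print(elements, weights, edges)
    # Hypothetical one-dimensional element features, for illustration only.
    print(message_pass([[1.0], [2.0]], weights, edges))
```

In a learned model, the hand-written average would be replaced by trainable update functions, and the per-element features would be pooled into a single fixed-length vector for the material before the property is predicted.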