Crystal Graph Convolutional Neural Networks for an Accurate and Interpretable Prediction of Material Properties

Tian Xie, Jeffrey C. Grossman
DOI: https://doi.org/10.1103/PhysRevLett.120.145301
2018-04-07
Abstract: The use of machine learning methods for accelerating the design of crystalline materials usually requires manually constructed feature vectors or complex transformation of atom coordinates to input the crystal structure, which either constrains the model to certain crystal types or makes it difficult to provide chemical insights. Here, we develop a crystal graph convolutional neural networks framework to directly learn material properties from the connection of atoms in the crystal, providing a universal and interpretable representation of crystalline materials. Our method provides a highly accurate prediction of density functional theory calculated properties for eight different properties of crystals with various structure types and compositions after being trained with $10^4$ data points. Further, our framework is interpretable because one can extract the contributions from local chemical environments to global properties. Using an example of perovskites, we show how this information can be utilized to discover empirical rules for materials design.
Materials Science
What problem does this paper attempt to address?
The problem this paper addresses is how to predict material properties both accurately and interpretably when accelerating the design of crystalline materials. Traditional machine-learning methods usually require manually constructed feature vectors or complex transformations of atomic coordinates to represent a crystal structure, which either restricts the model to specific crystal types or makes it difficult to extract chemical insights. The authors therefore develop a framework based on Crystal Graph Convolutional Neural Networks (CGCNN) that learns material properties directly from the connections between atoms in a crystal and provides a general, interpretable representation of crystalline materials.

### Main Problems and Solutions

1. **Problem Description**:
   - **Manually Constructed Feature Vectors or Complex Transformations**: For crystalline materials, traditional machine-learning methods usually need either hand-built, fixed-length feature vectors or symmetry-invariant transformations of atomic coordinates. The former must be designed case by case for each property, and the latter makes the model difficult to interpret.
   - **Limited Applicability**: These methods are either restricted to specific crystal types or difficult to extend to others.
2. **Solution**:
   - **CGCNN Framework**: Construct a crystal graph, in which nodes represent atoms and edges represent the connections between atoms, so that features are extracted directly from the crystal structure.
   - **Automatically Learning Features**: Use a Convolutional Neural Network (CNN) to learn the feature representation from the crystal graph automatically, without manual feature design.
   - **Interpretability**: Provide chemical insights by analyzing the contribution of each local chemical environment to the global properties, making the model interpretable.

### Specific Implementation

- **Crystal Graph Representation**: Convert the crystal structure into a graph, where nodes represent atoms and edges represent the connections between atoms. Each node and each edge carries a feature vector that encodes information about the atom or the bond, respectively.
- **Convolutional Layers and Pooling Layers**: Build a Convolutional Neural Network on top of the crystal graph. The convolutional layers iteratively update the atomic feature vectors using their neighbors, and the pooling layers combine them into a single feature vector for the entire crystal.
- **Fully-Connected Layers**: Add two fully-connected hidden layers to capture the complex mapping between crystal structure and properties.
- **Output Layer**: Predict the target property through the output layer.

### Advantages

- **High Accuracy**: After training, CGCNN reaches an accuracy comparable to that of DFT calculations in predicting 8 different properties.
- **Generality**: Applicable to crystalline materials of various structure types and compositions.
- **Interpretability**: Able to extract the contribution of the local chemical environment to the global properties, provide chemical insights, and help discover empirical rules for materials design.

Through these methods, CGCNN not only improves the accuracy of material property prediction but also provides valuable chemical insights, significantly reducing the search space for high-throughput screening.
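
The pipeline described under Specific Implementation (crystal graph, convolution, pooling, two fully-connected hidden layers, output) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: the function names are hypothetical, atoms are connected by a simple distance cutoff in a non-periodic toy cluster, the convolution is a simplified `tanh` message summed over neighbors, the pooling is a plain average, and the weights are random and untrained.

```python
import math
import random

def build_crystal_graph(positions, cutoff):
    """Directed edges (i, j, distance) for atom pairs within `cutoff`.
    Assumption: a non-periodic toy cluster; a real crystal would also
    need neighbors from periodic images of the unit cell."""
    edges = []
    for i, p in enumerate(positions):
        for j, q in enumerate(positions):
            d = math.dist(p, q)
            if i != j and d <= cutoff:
                edges.append((i, j, d))
    return edges

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def conv_layer(node_feats, edges, W):
    """Simplified convolution (an assumption, not the paper's exact update):
    v_i <- v_i + sum over neighbors j of tanh(W @ [v_i, v_j, d_ij])."""
    updated = [list(v) for v in node_feats]
    for i, j, d in edges:
        z = node_feats[i] + node_feats[j] + [d]  # concatenate atom and bond features
        message = [math.tanh(m) for m in matvec(W, z)]
        updated[i] = [a + b for a, b in zip(updated[i], message)]
    return updated

def pool(node_feats):
    """Average all atom vectors into one crystal vector (order-independent)."""
    n = len(node_feats)
    return [sum(column) / n for column in zip(*node_feats)]

def predict(crystal_vec, W1, W2, w_out):
    """Two fully-connected ReLU hidden layers, then a linear output."""
    h = [max(0.0, a) for a in matvec(W1, crystal_vec)]
    h = [max(0.0, a) for a in matvec(W2, h)]
    return sum(w * a for w, a in zip(w_out, h))

rng = random.Random(0)  # fixed seed so the toy run is repeatable

def rand_matrix(rows, cols):
    """Random, untrained weights used only to exercise the shapes."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

# Toy input: three atoms of two species, one-hot encoded (hypothetical data).
positions = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 1.5, 0.0)]
node_feats = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]

edges = build_crystal_graph(positions, cutoff=2.0)
W_conv = rand_matrix(2, 5)  # input is [v_i (2), v_j (2), d_ij (1)]
atom_vecs = conv_layer(node_feats, edges, W_conv)
crystal_vec = pool(atom_vecs)
prediction = predict(crystal_vec, rand_matrix(4, 2), rand_matrix(4, 4),
                     [rng.uniform(-0.5, 0.5) for _ in range(4)])
print(len(edges), crystal_vec, prediction)
```

Because the pooling step averages over atoms, the crystal vector does not depend on the order in which atoms are listed, which is one reason a graph representation can serve crystals of different sizes and structure types. The per-atom vectors in `atom_vecs` are the kind of local quantity one would inspect when attributing a global property to local chemical environments.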