Abstract: Predicting material properties has always been a challenging task in materials science. With the emergence of machine learning methodologies, new avenues have opened up. In this study, we build upon our recently developed Graph Neural Network (GNN) approach to construct models that predict four distinct material properties. Our graph model represents materials as element graphs, with chemical formula serving as the only input. This approach ensures permutation invariance, offering a robust solution to prior limitations. By employing bootstrap methods to train on this individual GNN, we further enhance the reliability and accuracy of our predictions. With multi-task learning, we harness the power of extensive datasets to boost the performance of smaller ones. We introduce the inaugural version of the Materials Properties Prediction (MAPP) framework, empowering the prediction of material properties solely based on chemical formulas.
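To make the element-graph input concrete, the sketch below turns a chemical formula into a set of element nodes weighted by atomic fraction. It is an illustration only, not the MAPP implementation: the helper names `formula_to_nodes` and `fully_connected_edges`, the use of atomic fractions as the node weight, and the fully connected edge list are all assumptions, and the parser handles only flat formulas without parentheses.

```python
import re
from collections import defaultdict

# Hypothetical helpers for illustration; not taken from the MAPP code base.
# Matches an element symbol followed by an optional integer or decimal amount.
_TOKEN = re.compile(r"([A-Z][a-z]?)(\d+(?:\.\d+)?)?")


def formula_to_nodes(formula):
    """Return {element: atomic fraction} for a flat formula such as 'Fe2O3'."""
    counts = defaultdict(float)
    for symbol, amount in _TOKEN.findall(formula):
        # A missing amount means one atom of that element.
        counts[symbol] += float(amount) if amount else 1.0
    total = sum(counts.values())
    return {element: n / total for element, n in counts.items()}


def fully_connected_edges(nodes):
    """List every ordered pair of distinct elements (assumed connectivity)."""
    elements = sorted(nodes)
    return [(a, b) for a in elements for b in elements if a != b]


nodes = formula_to_nodes("Fe2O3")
edges = fully_connected_edges(nodes)
```

Because the nodes form an unordered mapping, writing the same compound as `O3Fe2` yields an identical representation, which is the kind of order independence the abstract refers to as permutation invariance.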
What problem does this paper attempt to address?
The paper aims to address the challenging problem of predicting material properties in materials science, particularly how to efficiently and accurately predict various physical properties of materials based on their chemical formulas. The authors have developed a framework called Materials Properties Prediction (MAPP) that utilizes Graph Neural Network (GNN) methods to build models for predicting four different material properties: bulk modulus, volume, heat of fusion, and critical temperature of superconductors. The main features of this framework include:
- **Simple Input**: Only the chemical formula is required as input, making it applicable to any material with a known chemical formula.
- **Application of Graph Neural Networks**: Materials are represented as element graphs, with the chemical formula converted into node feature vectors that the GNN uses to predict material properties.
- **Multi-task Learning**: Improves performance on smaller datasets by training jointly with larger ones; for example, heat of fusion predictions become more accurate when the model is trained jointly with the melting point dataset.
- **Ensemble Models and Uncertainty Quantification**: Uses ensemble models to improve the robustness and accuracy of predictions and to estimate prediction uncertainty, which helps detect outliers.
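The ensemble idea in the last bullet can be sketched with bootstrap resampling: each ensemble member is trained on a resample of the training set drawn with replacement, and the spread of the members' predictions serves as an uncertainty estimate. The code below is a minimal sketch under stated assumptions. An ordinary least-squares fit stands in for the GNN, the function names are hypothetical, and using the standard deviation across members as the uncertainty is an assumption rather than the paper's documented procedure.

```python
import numpy as np


def fit_linear(X, y):
    """Stand-in for training one model: least squares with a bias term."""
    A = np.hstack([X, np.ones((X.shape[0], 1))])
    weights, *_ = np.linalg.lstsq(A, y, rcond=None)
    return weights


def predict_linear(weights, X):
    """Apply the fitted weights (last entry is the bias) to new inputs."""
    A = np.hstack([X, np.ones((X.shape[0], 1))])
    return A @ weights


def bootstrap_ensemble(X, y, n_models=20, seed=0):
    """Train n_models members, each on a resample drawn with replacement."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    models = []
    for _ in range(n_models):
        idx = rng.integers(0, n, size=n)  # bootstrap sample of row indices
        models.append(fit_linear(X[idx], y[idx]))
    return models


def ensemble_predict(models, X):
    """Return the ensemble mean and the member-to-member standard deviation."""
    preds = np.stack([predict_linear(w, X) for w in models])
    return preds.mean(axis=0), preds.std(axis=0)
```

A prediction whose standard deviation is large relative to the rest of the dataset would be a candidate outlier under this scheme; the threshold is a design choice left open here.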
Specifically, the paper discusses the prediction of the following material properties:
- **Bulk Modulus**: The model achieved an R² score of 0.95 on the test set, with an RMSE of 17.04 GPa and an MAE of 9.96 GPa. The ensemble model further improved these metrics.
- **Volume**: The model performed excellently, achieving an R² score of 0.97 on the test set, with an RMSE of 1.56 Å³ and an MAE of 0.65 Å³.
- **Critical Temperature of Superconductors**: The model showed outstanding performance in predicting high critical temperature superconductors, with an R² score of 0.91 on the test set, an RMSE of 10.16 K, and an MAE of 6.91 K.
- **Heat of Fusion**: Despite the smaller dataset, the model's R² score on the test set improved from 0.70 to 0.74 after training with the melting point dataset through multi-task learning, with an RMSE of 1.01 kcal/mol and an MAE of 0.67 kcal/mol.
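For readers unfamiliar with the three metrics quoted above, the sketch below computes R², RMSE, and MAE from their standard definitions. The function name `regression_metrics` and the toy numbers are illustrative assumptions and are not taken from the paper's data or code.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Compute R^2, RMSE, and MAE for paired true and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residual = y_true - y_pred
    ss_res = np.sum(residual ** 2)                   # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": np.sqrt(np.mean(residual ** 2)),
        "mae": np.mean(np.abs(residual)),
    }


# Toy values chosen only to make the arithmetic easy to follow.
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0])
```

RMSE and MAE carry the units of the predicted property (GPa, Å³, K, or kcal/mol in the list above), whereas R² is dimensionless, so only R² is directly comparable across the four properties.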
Through these methods, the MAPP framework demonstrates its potential in predicting material properties, promising to accelerate the design and discovery process of new materials.