Enhancing material property prediction with ensemble deep graph convolutional networks

Chowdhury Mohammad Abid Rahman, Ghadendra Bhandari, Nasser M Nasrabadi, Aldo H. Romero, Prashnna K. Gyawali
2024-07-27
Abstract: Machine learning (ML) models have emerged as powerful tools for accelerating materials discovery and design by enabling accurate predictions of properties from compositional and structural data. These capabilities are vital for developing advanced technologies across fields such as energy, electronics, and biomedicine, potentially reducing the time and resources needed for new material exploration and promoting rapid innovation cycles. Recent efforts have focused on employing advanced ML algorithms, including deep learning-based graph neural networks, for property prediction. Additionally, ensemble models have been shown to enhance the generalizability and robustness of ML and deep learning (DL) models. However, the use of such ensemble strategies in deep graph networks for material property prediction remains underexplored. Our research provides an in-depth evaluation of ensemble strategies in deep learning-based graph neural networks, specifically targeting material property prediction tasks. By testing the Crystal Graph Convolutional Neural Network (CGCNN) and its multitask version, MT-CGCNN, we demonstrate that ensemble techniques, especially prediction averaging, substantially improve precision beyond traditional metrics for key properties such as formation energy per atom ($\Delta E^{f}$), band gap ($E_{g}$), and density ($\rho$) in 33,990 stable inorganic materials. These findings support the broader application of ensemble methods to enhance predictive accuracy in the field.
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
The paper primarily aims to address the challenges in material property prediction, particularly by leveraging deep learning techniques to improve prediction accuracy and model generalization. Specifically, the research team focuses on how to enhance graph neural network (GNN)-based material property prediction models through ensemble methods.

The core issues addressed in the paper include:

1. **Challenges in Material Property Prediction**: The intrinsic correlation between the complex structures and properties of materials makes it difficult for machine learning models to accurately encode relevant structural information. Additionally, the wide range of variations in crystal structures poses challenges for representing model input data.
2. **Application of Graph Neural Networks**: Graph Neural Networks (GNNs) can effectively handle the inherent graph-structured data in material science, such as atomic and molecular structures. However, despite the current deep learning models' ability to integrate complex structural, geometric, and topological features to predict material properties, a comprehensive exploration of model training dynamics is still lacking.
3. **Potential of Ensemble Learning**: The paper hypothesizes that optimal model performance may not be limited to a single point of minimum validation loss but could be distributed across multiple regions of the loss landscape. Therefore, the research team focuses on exploring models at different training stages and proposes an ensemble strategy to create a unified ensemble model, aiming for more accurate prediction results and a deeper understanding of material properties.

The main contributions of the paper can be summarized as follows:

1. **Introduction of Ensemble Techniques**: Applying ensemble techniques to well-known graph neural network methods, such as the Crystal Graph Convolutional Neural Network (CGCNN) and its multi-task variant MT-CGCNN, to enhance their capabilities in material property prediction.
2. **Experimental Validation**: Conducting comprehensive experiments with detailed evaluations on three widely studied material properties: formation energy per atom, band gap, and density.
3. **Comprehensive Evaluation**: Assessing the impact of the ensemble model across the entire spectrum of material properties, emphasizing the effectiveness of ensemble methods under extreme testing conditions.

In summary, this paper aims to improve the accuracy of material property prediction by integrating models from multiple training stages and demonstrates the effectiveness of this approach.
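
The idea of combining models from different training stages through prediction averaging can be illustrated with a minimal sketch. It is not the authors' implementation: the checkpoint dictionaries, the `select_top_k` rule (keep the `k` checkpoints with the lowest validation loss), and all numbers are assumptions made up for illustration. In practice the predictions would come from saved CGCNN or MT-CGCNN checkpoints.

```python
from statistics import fmean


def select_top_k(checkpoints, k):
    """Return the k checkpoints with the lowest validation loss (hypothetical rule)."""
    return sorted(checkpoints, key=lambda c: c["val_loss"])[:k]


def average_predictions(checkpoints):
    """Prediction averaging: mean of the per-sample predictions across checkpoints."""
    per_sample = zip(*(c["preds"] for c in checkpoints))
    return [fmean(values) for values in per_sample]


def mean_absolute_error(y_true, y_pred):
    """Mean absolute error between targets and predictions."""
    return fmean(abs(t - p) for t, p in zip(y_true, y_pred))


# Toy data (made up): three checkpoints from one training run, each with
# its validation loss and its predictions for three test materials.
targets = [1.0, 2.0, 3.0]
checkpoints = [
    {"epoch": 10, "val_loss": 0.30, "preds": [1.4, 2.4, 2.6]},
    {"epoch": 20, "val_loss": 0.20, "preds": [0.8, 2.2, 3.2]},
    {"epoch": 30, "val_loss": 0.25, "preds": [1.2, 1.8, 2.8]},
]

top = select_top_k(checkpoints, k=2)
ensemble = average_predictions(top)

for c in top:
    print(c["epoch"], mean_absolute_error(targets, c["preds"]))
print("ensemble", mean_absolute_error(targets, ensemble))
```

The toy numbers are chosen so that the errors of the two selected checkpoints partly cancel, which is the mechanism averaging relies on. They say nothing about the size of the improvement reported in the paper.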