Leveraging Large-scale Computational Database and Deep Learning for Accurate Prediction of Material Properties

Pin Chen, Jianwen Chen, Hui Yan, Qing Mo, Zexin Xu, Jinyu Liu, Wenqing Zhang, Yuedong Yang, Yutong Lu
DOI: https://doi.org/10.48550/arXiv.2112.14429
2021-12-29
Materials Science
Abstract: Accurately predicting the physical and chemical properties of materials remains one of the most challenging tasks in material design, and one effective strategy is to construct a reliable data set and use it to train a machine learning model. In this study, we constructed a large-scale material genome database (Matgen) containing 76,463 materials collected from an experimentally observed database, and computed their bandgap properties using density functional theory (DFT) with the Perdew-Burke-Ernzerhof (PBE) functional. We verified the computation method by comparing part of our results with those from the open Materials Project (MP) and the Open Quantum Materials Database (OQMD), all computed with PBE, and found that Matgen achieved the same computational accuracy based on both measured and computed bandgap properties. Based on the computed properties of our comprehensive dataset, we developed a new graph-based deep learning model, CrystalNet, built on our recently developed Communicative Message Passing Neural Network (CMPNN) framework. The model was shown to outperform other state-of-the-art prediction models. Further fine-tuning on 1716 experimental bandgap values (CrystalNet-TL) achieved superior performance, with a mean absolute error (MAE) of 0.77 eV on an independent test set, outperforming pure PBE (1.14 to 1.45 eV). Moreover, the model was shown to be applicable to hypothetical materials, with an MAE of 0.77 eV relative to computations from HSE, a highly accurate quantum mechanics (QM) method, consistently better than PBE (MAE = 1.13 eV). We have also made the material structures, the PBE-computed properties, and the CrystalNet models publicly available at https://matgen.nscc-gz.cn.
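The abstract reports model quality as mean absolute error (MAE) in eV. As a minimal illustration of how that metric is computed, the sketch below averages the absolute differences between predicted and reference bandgaps. The function name `mean_absolute_error` and the sample values are hypothetical, chosen only for illustration; they are not data or code from the paper.

```python
from typing import Sequence


def mean_absolute_error(predicted: Sequence[float], reference: Sequence[float]) -> float:
    """Return the mean of |predicted - reference| over paired values."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if len(predicted) == 0:
        raise ValueError("at least one pair of values is required")
    total = sum(abs(p - r) for p, r in zip(predicted, reference))
    return total / len(predicted)


# Hypothetical bandgaps in eV (illustrative only, not taken from the paper).
predicted_gaps = [1.0, 2.5, 0.0]
reference_gaps = [1.5, 2.0, 0.5]

mae = mean_absolute_error(predicted_gaps, reference_gaps)
print(f"MAE = {mae:.2f} eV")
```

A lower MAE means predictions sit closer to the reference values on average, which is the sense in which the abstract compares CrystalNet-TL against PBE.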