Improved Physics-Based Structural Descriptors of Perovskite Materials Enable Higher Accuracy of Machine Learning
Changjiao Li, Hua Hao, Ben Xu, Zhonghui Shen, Enhao Zhou, Dongbing Jiang, Hanxing Liu
DOI: https://doi.org/10.1016/j.commatsci.2021.110714
IF: 3.572
2021-01-01
Computational Materials Science
Abstract: With the rapid development of computational materials science and materials databases, machine learning methods have achieved remarkable results in predicting the basic properties of perovskite materials. However, because of the structural diversity and compositional flexibility of perovskites, it remains a challenge to construct a comprehensive feature set that includes structural characteristics. This limits the prediction accuracy of machine learning and may cause some underlying physical information to be missed. In this work, a comprehensive feature set for perovskites was established based on chemical composition and physical structure. Five descriptors with explicit physical meaning were proposed as features to describe the structural characteristics: the ionic-radii calculated tolerance factor (tIR), the bond valence vector sum (BVVS), the discrepancy factor (Di), the global instability index (GII), and the bond valence based tolerance factor (tBV). The band gap (Eg) was then taken as an example to test the feature set with Support Vector Regression (SVR), Random Forest Regression (RF), Bagging Regression (bagging), and Gradient Boosting Regression (GBR). Compared with the models that used only element information as descriptors, the models with structural features were significantly more accurate (the R² values of SVR, RF, bagging, and GBR increased by 0.0891, 0.1078, 0.1076, and 0.134, respectively). This indicates that these physical descriptors can effectively quantify the structural characteristics of perovskites and can therefore be extended to various studies of the complex component-structure-performance relationships of perovskites.
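To make two of the descriptors named in the abstract more concrete, the following is a minimal sketch of how an ionic-radii tolerance factor and a global instability index are commonly computed. It assumes the standard textbook definitions (the Goldschmidt tolerance factor, the exponential bond valence relation with b = 0.37 Å, and GII as the root-mean-square deviation of bond valence sums from formal valences); the abstract does not give the authors' exact formulas, so these may differ in detail from the paper. All function names are hypothetical, and the radii are illustrative inputs for SrTiO3.

```python
import math


def tolerance_factor(r_a, r_b, r_x):
    """Goldschmidt tolerance factor for an ABX3 perovskite (assumed form of tIR)."""
    return (r_a + r_x) / (math.sqrt(2) * (r_b + r_x))


def bond_valence(bond_length, r0, b=0.37):
    """Valence of a single bond, s = exp((r0 - R) / b).

    r0 is the tabulated bond valence parameter for the cation-anion pair;
    b = 0.37 angstrom is the commonly used empirical constant (assumption).
    """
    return math.exp((r0 - bond_length) / b)


def global_instability_index(bond_valence_sums, formal_valences):
    """Root-mean-square deviation of bond valence sums from formal valences.

    Each per-site deviation plays the role of a discrepancy factor
    (assumed definition: bond valence sum minus |formal valence|).
    """
    deviations = [s - abs(v) for s, v in zip(bond_valence_sums, formal_valences)]
    return math.sqrt(sum(d * d for d in deviations) / len(deviations))


if __name__ == "__main__":
    # Illustrative ionic radii in angstrom for Sr2+, Ti4+, and O2-.
    t = tolerance_factor(r_a=1.44, r_b=0.605, r_x=1.40)
    print(f"tolerance factor: {t:.3f}")

    # Hypothetical bond valence sums for two sites with formal valences +2 and +4.
    gii = global_instability_index([2.1, 3.9], [2, 4])
    print(f"GII: {gii:.3f}")
```

A tolerance factor close to 1 corresponds to a nearly ideal cubic perovskite, and a GII close to 0 corresponds to a structure whose bond valence sums match the formal valences. Descriptors of this kind would be appended to the composition-based features before fitting the regression models.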