Machine learning regression model for predicting the band gap of multi-elements nonlinear optical crystals

Yaohui Yin, Ai Wang, Zhixin Sun, Chao Xin, Guangyong Jin
DOI: https://doi.org/10.1016/j.commatsci.2024.113109
IF: 3.572
2024-05-24
Computational Materials Science
Abstract: Nonlinear optical (NLO) crystals are regarded as catalytic materials for photon coupling due to their high conversion efficiency, tunability, and ease of manipulation. We aim to find better-performing NLO crystals by precisely estimating their bandgaps. Our approach uses compositional, structural, and orbital data as input features for models based on the Random Forest regression (RFR), Extreme Gradient Boosting (XGB), and Gradient Boosting regression (GBR) algorithms. Notably, the RFR model exhibited the best predictive performance in our research, attaining an R² value of 0.823, with an RMSE of 0.639 eV and an MAE of 0.470 eV per atom. Our model works well on a small dataset. Moreover, we incorporated Shapley Additive exPlanations (SHAP) analysis to elucidate the rationale behind predictions by assessing the contribution of each feature to the bandgap. To verify the reliability of our model, we selected three traditional crystals and conducted first-principles electronic structure calculations. We compared the values predicted by our model with those computed using the Heyd-Scuseria-Ernzerhof (HSE) hybrid functional and found an error margin of approximately 0.5 eV, confirming the precision of our model. These models facilitate swift and cost-effective predictions of bandgaps in NLO crystals.
materials science, multidisciplinary
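
The three error measures quoted in the abstract (R², RMSE, and MAE) can be illustrated with a minimal sketch. This is not the authors' code: the function name `regression_metrics` is hypothetical, and the band gap values below are made-up numbers chosen only to keep the arithmetic easy to follow, not data from the paper.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (R^2, RMSE, MAE) for paired reference and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    mean_true = sum(y_true) / n
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)               # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all reference values are equal")
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(r) for r in residuals) / n
    return r2, rmse, mae


# Made-up illustrative band gaps in eV (NOT data from the paper).
reference_gaps = [4.0, 5.0, 6.0, 7.0]
predicted_gaps = [4.5, 4.5, 6.5, 6.5]

r2, rmse, mae = regression_metrics(reference_gaps, predicted_gaps)
print(f"R^2 = {r2:.3f}, RMSE = {rmse:.3f} eV, MAE = {mae:.3f} eV")
```

In practice the reference values would be held-out band gaps and the predictions would come from a fitted regressor such as a random forest; the metric definitions stay the same.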