Predicting the Maximum Absorption Wavelength of Azo Dyes Using an Interpretable Machine Learning Strategy

Jiaqi Mai, Tian Lu, Pengcheng Xu, Zhengheng Lian, Minjie Li, Wencong Lu
DOI: https://doi.org/10.1016/j.dyepig.2022.110647
IF: 5.122
2022-01-01
Dyes and Pigments
Abstract: The maximum absorption wavelength (λmax) is one of the most important properties of azo dyes. Obtaining the λmax of azo dyes quickly is essential for the development of new molecules. Herein, the machine learning algorithm "XGBoost" was used to establish a model for predicting the λmax of azo dyes. The coefficients of determination (R²) for leave-one-out cross-validation (LOOCV) and the test set were 0.87 and 0.73, respectively. According to SHapley Additive exPlanations (SHAP) analysis, the number of sulfur atoms in the R2 group has a strong positive correlation with λmax. The more C-N pairs at topological distance 4 appear in the R1 group, the more likely the molecular λmax is to be red-shifted. Further, a high-throughput screening strategy was adopted to screen out 26 azo molecules with larger λmax from nearly 20,000 virtual samples. The λmax values of these molecules are expected to be red-shifted relative to the 610 nm in the dataset. Our study provides a convenient way to search for azo dyes with larger λmax.
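
The LOOCV R² reported in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic, the single descriptor is hypothetical, and a one-variable least-squares line stands in for the XGBoost model, so this is not the authors' pipeline. It only shows how each sample is held out in turn, predicted by a model fitted on the rest, and how R² is then computed over the pooled held-out predictions.

```python
from statistics import mean


def fit_line(xs, ys):
    """Least-squares fit y = a*x + b (placeholder model, NOT XGBoost)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, y_bar - a * x_bar


def loocv_predictions(xs, ys):
    """Hold out each sample once; predict it from a model fitted on the rest."""
    preds = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        a, b = fit_line(train_x, train_y)
        preds.append(a * xs[i] + b)
    return preds


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Synthetic, made-up values: one hypothetical descriptor and lambda_max in nm.
descriptor = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
lambda_max = [400.0, 432.0, 455.0, 493.0, 520.0, 548.0]

preds = loocv_predictions(descriptor, lambda_max)
loocv_r2 = r_squared(lambda_max, preds)
print(f"LOOCV R^2 on synthetic data: {loocv_r2:.3f}")
```

In a real workflow the placeholder `fit_line` would be replaced by the trained regressor and the single descriptor by the full feature vector; the hold-out loop and the R² formula stay the same.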