Prediction and Application in QSPR of Aqueous Solubility of Sulfur-Containing Aromatic Esters Using GA-based MLR with Quantum Descriptors
CS Yin, XH Liu, WM Guo, T Lin, XD Wang, LS Wang
DOI: https://doi.org/10.1016/s0043-1354(01)00532-2
IF: 12.8
2002-01-01
Water Research
Abstract: Quantitative structure–property relationships (QSPR) were developed using a genetic algorithm (GA)-based variable-selection approach with quantum chemical descriptors derived from AM1-based calculations (MOPAC 7.0). With the QSPR models, the aqueous solubility of 71 aromatic sulfur-containing carboxylates, including phenylthio and phenylsulfonyl carboxylates, was efficiently estimated and predicted. Using GA-based multivariate linear regression (MLR) with a cross-validation procedure, the most important descriptors were selected from a pool of 28 semi-empirical quantum chemical descriptors, including steric and electronic types, to build the QSPR models. The molecular descriptors included molecular surface (SA), charges on the carboxyl group (QOC), and the magnitude of the difference between EHOMO of the solute and ELUMO of water, divided by 100 (EB); these were the main factors affecting the aqueous solubility of the compounds of interest. The resulting coefficients R and R2 of 0.9571 and 0.9161, and the prediction residual error sum of squares (PRESS) of 13.1768, indicated that the model predicted the aqueous solubility of the investigated organic compounds accurately and reliably. When two outliers were omitted from the dataset, the results improved significantly, to R=0.9619, R2=0.9253, and PRESS=10.3875. Compared with stepwise regression analysis, the results obtained in this work were better and more reasonable, and the best QSPR model was obtained by GA-based MLR. Reasonable mechanisms for the aqueous solubility of the sulfur-containing carboxylates were investigated and interpreted.
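The abstract describes GA-based MLR with cross-validation only in outline, so the following is a minimal sketch of how such a descriptor search might look, not the authors' implementation. It assumes a fixed subset size of three descriptors, leave-one-out PRESS as the fitness to minimize, and a simple elitist GA; the function names (loo_press, ga_select), the GA parameters, and the synthetic data are all hypothetical, and NumPy is assumed to be available.

```python
import numpy as np


def loo_press(X, y):
    """Leave-one-out PRESS for an ordinary least-squares fit with intercept."""
    A = np.column_stack([np.ones(len(y)), X])  # design matrix with intercept
    H = A @ np.linalg.pinv(A)                  # hat matrix of the full fit
    resid = y - H @ y                          # ordinary residuals
    loo_resid = resid / (1.0 - np.diag(H))     # leave-one-out residuals
    return float(np.sum(loo_resid ** 2))


def ga_select(X, y, k=3, pop_size=30, generations=40, mutation_rate=0.2, seed=0):
    """Search for a k-descriptor subset with low PRESS (illustrative GA only)."""
    rng = np.random.default_rng(seed)
    n_desc = X.shape[1]
    cache = {}

    def fitness(subset):
        # Lower PRESS is better; cache values to avoid refitting the same subset.
        if subset not in cache:
            cache[subset] = loo_press(X[:, list(subset)], y)
        return cache[subset]

    def random_subset():
        picks = rng.choice(n_desc, size=k, replace=False)
        return tuple(sorted(int(i) for i in picks))

    pop = [random_subset() for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)
        parents = pop[: pop_size // 2]         # elitism: keep the better half
        children = []
        while len(parents) + len(children) < pop_size:
            i, j = rng.choice(len(parents), size=2, replace=False)
            # Crossover: draw k descriptors from the union of both parents.
            pool = sorted(set(parents[i]) | set(parents[j]))
            child = set(int(d) for d in rng.choice(pool, size=k, replace=False))
            # Mutation: swap one descriptor for one currently outside the child.
            if rng.random() < mutation_rate:
                child.discard(int(rng.choice(sorted(child))))
                outside = [d for d in range(n_desc) if d not in child]
                child.add(int(rng.choice(outside)))
            children.append(tuple(sorted(child)))
        pop = parents + children
    best = min(pop, key=fitness)
    return best, fitness(best)


if __name__ == "__main__":
    # Synthetic stand-in for a 71-compound, 28-descriptor table (not real data).
    rng = np.random.default_rng(1)
    X = rng.normal(size=(71, 28))
    y = 1.5 * X[:, 2] - 2.0 * X[:, 7] + 0.8 * X[:, 19] + rng.normal(scale=0.1, size=71)
    subset, press = ga_select(X, y, k=3)
    print("selected descriptor columns:", subset, "PRESS:", round(press, 4))
```

Scoring each subset by PRESS rather than by the fitted R2 penalizes subsets that only fit the training compounds well, which is the role the cross-validation procedure plays in the abstract. A stepwise regression baseline could be compared by evaluating loo_press on the subset it returns.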