[Research on QSPR for N-Octanol-water Partition Coefficients of Organic Compounds Based on Genetic Algorithms-Support Vector Machine and Genetic Algorithms-Radial Basis Function Neural Networks].

Jun Qi, Jun-Feng Niu, Li-Li Wang
DOI: https://doi.org/10.3321/j.issn:0250-3301.2008.01.036
2008-01-01
Abstract: A modified method for developing quantitative structure-property relationship (QSPR) models of organic compounds is proposed, combining a genetic algorithm (GA) with a support vector machine (SVM) (GA-SVM). The GA performs variable selection, and the SVM constructs the QSPR model. GA-SVM was applied to model the n-octanol-water partition coefficients (Kow) of 38 organic compounds typical of the food industry. Five descriptors (molecular weight, Hansen polarity, boiling point, percent oxygen and percent hydrogen) were selected for the QSPR model. Between measured and predicted values, the GA-SVM model gave a coefficient of multiple determination (R²) of 0.999, a sum of squares due to error (SSE) of 0.048 and a root mean squared error (RMSE) of 0.036, indicating good predictive capability for the lg Kow values of these compounds. Under leave-one-out cross validation, the GA-SVM model also showed good robustness (SSE = 0.295, RMSE = 0.089, R² = 0.995). The GA-SVM models were further compared with models built by a genetic algorithm-radial basis function neural network (GA-RBFNN) and by a linear method. GA-SVM showed the best predictive capability and robustness of the three, which suggests it is the most suitable of these methods for developing QSPR models of lg Kow for these compounds.
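The workflow described in the abstract (a GA searching over descriptor subsets, each subset scored by a regression model under leave-one-out cross validation) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the data are synthetic, the GA settings (population size, mutation rate, truncation selection, one-point crossover) are assumptions, the RMSE divisor n is an assumption, and a k-nearest-neighbour regressor stands in for the SVM so that the sketch needs only the Python standard library. All function names here are hypothetical.

```python
import math
import random


def loo_metrics(X, y, fit_predict):
    """Leave-one-out cross validation; returns (SSE, RMSE, R2)."""
    n = len(y)
    preds = []
    for i in range(n):
        # Train on every sample except i, then predict sample i.
        preds.append(fit_predict(X[:i] + X[i + 1:], y[:i] + y[i + 1:], X[i]))
    sse = sum((yi - pi) ** 2 for yi, pi in zip(y, preds))
    mean_y = sum(y) / n
    sst = sum((yi - mean_y) ** 2 for yi in y)
    rmse = math.sqrt(sse / n)  # assumption: divisor n; the abstract does not state it
    return sse, rmse, 1.0 - sse / sst


def knn_fit_predict(X_train, y_train, x, k=3):
    """Stand-in regressor (NOT an SVM): mean target of the k nearest samples."""
    order = sorted(range(len(y_train)), key=lambda j: math.dist(X_train[j], x))
    nearest = order[:k]
    return sum(y_train[j] for j in nearest) / len(nearest)


def select_columns(X, mask):
    """Keep only the descriptor columns whose mask entry is True."""
    return [[v for v, keep in zip(row, mask) if keep] for row in X]


def ga_select(X, y, fit_predict, pop_size=12, generations=15, p_mut=0.1, seed=0):
    """GA over boolean descriptor masks, minimising leave-one-out SSE."""
    rng = random.Random(seed)
    n_feat = len(X[0])

    def fitness(mask):
        if not any(mask):
            return float("inf")  # an empty descriptor set is not a valid model
        return loo_metrics(select_columns(X, mask), y, fit_predict)[0]

    pop = [[rng.random() < 0.5 for _ in range(n_feat)] for _ in range(pop_size)]
    pop[0] = [True] * n_feat  # always include the all-descriptors baseline
    for _ in range(generations):
        scored = sorted(pop, key=fitness)
        parents = scored[: pop_size // 2]  # truncation selection: better half
        children = [scored[0]]             # elitism: best mask survives unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_feat)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation with probability p_mut per descriptor.
            child = [(not g) if rng.random() < p_mut else g for g in child]
            children.append(child)
        pop = children
    return min(pop, key=fitness)


# Synthetic demo data (not from the paper): the target depends on columns 0 and 2.
demo_rng = random.Random(1)
X_demo = [[demo_rng.random() for _ in range(4)] for _ in range(30)]
y_demo = [2.0 * row[0] - row[2] for row in X_demo]

mask = ga_select(X_demo, y_demo, knn_fit_predict)
print("selected columns:", [j for j, keep in enumerate(mask) if keep])
print("LOO SSE, RMSE, R2:",
      loo_metrics(select_columns(X_demo, mask), y_demo, knn_fit_predict))
```

To move closer to the setup in the abstract, `knn_fit_predict` could be swapped for a wrapper around an SVM regressor (for example scikit-learn's `SVR`), since `ga_select` only requires a callable that takes training descriptors, training targets and one query sample. Because the all-descriptors mask is seeded into the first generation and the best mask is carried forward each generation, the returned subset never scores worse in leave-one-out SSE than using every descriptor.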