QSPR Study of N‐octanol/water Partition Coefficient of Some Aromatic Compounds Using Support Vector Regression

Shan-Sheng Yang, Wen-Cong Lu, Tian-Hong Gu, Liu-Ming Yan, Guo-Zheng Li
DOI: https://doi.org/10.1002/qsar.200810025
2009-01-01
QSAR & Combinatorial Science
Abstract: A Quantitative Structure-Property Relationship (QSPR) model was developed to correlate the structures of aromatic compounds with their n-octanol/water partition coefficients (log Kow). Sixty-eight molecular descriptors, derived solely from the structures of the aromatic compounds, were calculated using Gaussian 03, HyperChem 7.5, and TSAR V3.3. The descriptors were screened by the minimum Redundancy Maximum Relevance (mRMR)-Genetic Algorithm (GA)-Support Vector Regression (SVR) method. The parameters of the SVR model were optimized using five-fold cross-validation. The QSPR model was developed from a training set of 300 compounds using the SVR method and gave a good determination coefficient (R² = 0.85). The model was then tested on an external test set of 50 compounds and showed satisfactory external predictive ability (q² = 0.84). The results show that the mRMR-GA-SVR feature selection method and the SVR method can be used to model log Kow for a diverse set of aromatic compounds and could be promising tools in the field of QSPR research.
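The abstract reports a training-set determination coefficient, an external q², and five-fold cross-validation, but it does not give the formulas or the fold assignment. The sketch below shows one common way these quantities are computed, using only the Python standard library. It assumes that the external q² is referenced to the training-set mean, which is one convention among several and may not be the one the authors used. The function names `r_squared`, `q_squared_ext`, and `k_fold_indices` are hypothetical and chosen for illustration only.

```python
from statistics import mean


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - y_bar) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot


def q_squared_ext(y_test, y_pred, y_train_mean):
    """External q2 relative to the training-set mean.

    Assumption: this is one common convention; the abstract does not
    state which definition of q2 was used.
    """
    press = sum((y - p) ** 2 for y, p in zip(y_test, y_pred))
    ss_ref = sum((y - y_train_mean) ** 2 for y in y_test)
    return 1.0 - press / ss_ref


def k_fold_indices(n_samples, k=5):
    """Split range(n_samples) into k contiguous folds of near-equal size.

    Returns a list of (train_indices, validation_indices) pairs.
    In practice the samples would usually be shuffled beforehand.
    """
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        # The first `extra` folds take one additional sample each.
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return [
        ([j for f in folds[:i] + folds[i + 1:] for j in f], folds[i])
        for i in range(k)
    ]
```

With the 300-compound training set mentioned in the abstract, `k_fold_indices(300, 5)` would yield five splits of 240 training and 60 validation compounds each. A candidate SVR parameter setting would be scored on each validation fold and the scores averaged before comparing settings.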