In Silico Toxicity Prediction of Chemicals from EPA Toxicity Database by Kernel Fusion-Based Support Vector Machines

Dong-Sheng Cao, Jie Dong, Ning-Ning Wang, Ming Wen, Bai-Chuan Deng, Wen-Bin Zeng, Qing-Song Xu, Yi-Zeng Liang, Ai-Ping Lu, Alex F. Chen
DOI: https://doi.org/10.1016/j.chemolab.2015.07.009
IF: 4.175
2015-01-01
Chemometrics and Intelligent Laboratory Systems
Abstract: There is a great need to assess the harmful effects, or toxicities, of chemicals to which humans are exposed. In the present paper, the kernel fusion technique, together with the state-of-the-art support vector machine (SVM) algorithm, was developed to classify the toxicity of chemicals from the Distributed Structure-Searchable Toxicity (DSSTox) database network. In this method, different kernels were first constructed by applying different molecular fingerprint systems, including FP2, FP4 and MACCS, and these kernels were then integrated to form a new fused kernel strictly under the algorithmic framework of kernel methods. The fused kernel can accurately measure the similarities of molecules for toxicity classification, taking advantage of the complementarity among multiple kernels and thereby improving prediction performance. Two model validation approaches, five-fold cross-validation and an independent validation set, were used to assess the predictive capability of the developed models. The results indicate that the kernel fusion-based SVM gave the best prediction ability compared with single-fingerprint kernels, and could therefore be regarded as a promising alternative modeling approach for toxicity prediction of chemicals.
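The kernel fusion idea can be illustrated with a minimal sketch. The abstract does not state which kernel function or fusion rule was used, so the Tanimoto kernel and the unweighted average below are assumptions chosen for illustration, and the toy fingerprints are hypothetical. The sketch relies only on the fact that a non-negative weighted sum of valid kernels is itself a valid kernel.

```python
from typing import FrozenSet, List, Optional, Sequence

# A fingerprint is represented here as the set of its "on" bit indices.
Fingerprint = FrozenSet[int]
Matrix = List[List[float]]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto similarity: shared on-bits divided by on-bits in the union."""
    if not a and not b:
        return 1.0  # convention: two empty fingerprints are identical
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def gram_matrix(fps: Sequence[Fingerprint]) -> Matrix:
    """Pairwise kernel (Gram) matrix for one fingerprint system."""
    return [[tanimoto(x, y) for y in fps] for x in fps]


def fuse_kernels(grams: Sequence[Matrix],
                 weights: Optional[Sequence[float]] = None) -> Matrix:
    """Weighted sum of Gram matrices (assumed fusion rule; equal weights by default)."""
    if weights is None:
        weights = [1.0 / len(grams)] * len(grams)
    n = len(grams[0])
    return [[sum(w * g[i][j] for w, g in zip(weights, grams))
             for j in range(n)]
            for i in range(n)]


# Hypothetical toy data: three molecules described by three fingerprint systems.
fp2 = [frozenset({1, 2, 3}), frozenset({2, 3, 4}), frozenset({7, 8})]
fp4 = [frozenset({10, 11}), frozenset({10, 11}), frozenset({12})]
maccs = [frozenset({5, 6}), frozenset({5}), frozenset({6, 9})]

fused = fuse_kernels([gram_matrix(f) for f in (fp2, fp4, maccs)])
```

A fused Gram matrix of this kind could then be passed to any SVM implementation that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`. The weights could also be tuned by cross-validation rather than fixed to equal values.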