Prediction of the acute toxicity of chemical compounds to the fathead minnow by machine learning approaches

Ning-Xin Tan, Ping Li, Han-Bing Rao, Ze-Rong Li, Xiang-Yuan Li
DOI: https://doi.org/10.1016/j.chemolab.2009.11.002
IF: 4.175
2010-01-01
Chemometrics and Intelligent Laboratory Systems
Abstract: Support vector machines (SVM) and artificial neural networks (ANN) are applied to predict the acute toxicity of compounds to the fathead minnow from molecular structure. A diverse set of 611 compounds, comprising 442 fathead minnow toxicity (FMT) agents and 169 non-FMT agents, is used to develop the classification models. A hybrid feature selection method, which combines Fisher's score with Monte Carlo simulated annealing embedded in the SVM approach, is used to select the relevant descriptors from 1559 molecular descriptors. Five-fold cross-validation is used to optimize the model parameters and select the relevant descriptors. Using the 60 selected descriptors, the SVM model gives an average prediction accuracy of 95.5% for FMT, 79.3% for non-FMT and 91.0% for all samples, while the corresponding values for the ANN model are 92.5%, 75.2% and 87.7%, respectively. The study indicates that the hybrid feature selection method is efficient and that the descriptors selected with the SVM approach also perform well with the ANN approach. A hold-out method is used to build the final classification models from the selected descriptors and the model parameters optimized in the 5-fold cross-validation. The final SVM model gives a prediction accuracy of 96.6% for FMT, 93.0% for non-FMT and 95.1% for all samples, while the corresponding values for the ANN model are 91.4%, 90.7% and 91.1%, respectively.
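To make the filter step and the reported accuracies more concrete, the following is a minimal sketch rather than the authors' implementation. It assumes one common two-class form of the Fisher score, (mean_FMT - mean_nonFMT)^2 / (var_FMT + var_nonFMT), which the abstract does not spell out, and it leaves out the Monte Carlo simulated annealing and SVM stages entirely. The function names `fisher_score`, `rank_descriptors` and `overall_accuracy`, and the toy descriptor values, are hypothetical.

```python
from statistics import mean, pvariance


def fisher_score(values, labels):
    """Two-class Fisher score of a single descriptor (assumed form):
    (mean_pos - mean_neg)^2 / (var_pos + var_neg), with label 1 = FMT, 0 = non-FMT."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    gap = (mean(pos) - mean(neg)) ** 2
    spread = pvariance(pos) + pvariance(neg)
    if spread == 0:
        # No within-class spread: perfectly separating if the means differ,
        # otherwise the descriptor carries no class information.
        return float("inf") if gap > 0 else 0.0
    return gap / spread


def rank_descriptors(columns, labels):
    """Return descriptor names sorted from most to least discriminating."""
    return sorted(columns, key=lambda name: fisher_score(columns[name], labels), reverse=True)


def overall_accuracy(acc_pos, n_pos, acc_neg, n_neg):
    """Overall accuracy as the class-size-weighted mean of the per-class accuracies."""
    return (acc_pos * n_pos + acc_neg * n_neg) / (n_pos + n_neg)


if __name__ == "__main__":
    # Toy data (hypothetical): three FMT and three non-FMT compounds, two descriptors.
    labels = [1, 1, 1, 0, 0, 0]
    columns = {
        "descriptor_a": [5.0, 5.1, 4.9, 1.0, 1.1, 0.9],  # separates the classes
        "descriptor_b": [1.0, 2.0, 3.0, 1.0, 2.0, 3.0],  # identical in both classes
    }
    print(rank_descriptors(columns, labels))

    # Class sizes and cross-validation accuracies taken from the abstract.
    print(round(overall_accuracy(0.955, 442, 0.793, 169), 3))  # SVM
    print(round(overall_accuracy(0.925, 442, 0.752, 169), 3))  # ANN
```

With the class sizes from the abstract (442 FMT, 169 non-FMT), the weighted mean of the per-class cross-validation accuracies reproduces the reported overall values of 91.0% for SVM and 87.7% for ANN. The hold-out figures cannot be checked the same way because the abstract does not give the test-set composition.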