Predicting the Aquatic Toxicity Mode of Action Using Logistic Regression and Linear Discriminant Analysis

Y. Y. Ren,L. C. Zhou,L. Yang,P. Y. Liu,B. W. Zhao,H. X. Liu
DOI: https://doi.org/10.1080/1062936x.2016.1229691
IF: 3.681
2016-01-01
SAR and QSAR in Environmental Research
Abstract:The paper highlights the use of the logistic regression (LR) method in the construction of acceptable statistically significant, robust and predictive models for the classification of chemicals according to their aquatic toxic modes of action. Essentials accounting for a reliable model were all considered carefully. The model predictors were selected by stepwise forward discriminant analysis (LDA) from a combined pool of experimental data and chemical structure-based descriptors calculated by the CODESSA and DRAGON software packages. Model predictive ability was validated both internally and externally. The applicability domain was checked by the leverage approach to verify prediction reliability. The obtained models are simple and easy to interpret. In general, LR performs much better than LDA and seems to be more attractive for the prediction of the more toxic compounds, i.e. compounds that exhibit excess toxicity versus non-polar narcotic compounds and more reactive compounds versus less reactive compounds. In addition, model fit and regression diagnostics was done through the influence plot which reflects the hat-values, studentized residuals, and Cook's distance statistics of each sample. Overdispersion was also checked for the LR model. The relationships between the descriptors and the aquatic toxic behaviour of compounds are also discussed.
What problem does this paper attempt to address?