Abstract: A quantitative structure–activity relationship (QSAR) study was conducted on 313 pesticides to predict their acute toxicity to sheepshead minnow (Cyprinodon variegatus) using DRAGON descriptors. The OECD (Organisation for Economic Co-operation and Development) principles for the regulatory acceptability of QSAR models were followed throughout model construction and assessment. Nine variables were selected by forward stepwise regression and used as inputs to construct both linear and nonlinear models. The models were validated internally and externally. In general, the machine learning methods, namely support vector machine (SVM), random forest (RF), and projection pursuit regression (PPR), performed better than the multiple linear regression (MLR) model. The statistical results (R² = 0.682–0.933, Q²_LOO = 0.604–0.659, Q²_F1 = 0.740–0.796, CCC = 0.861–0.882) indicate that the developed models are robust, reliable, reproducible, accurate and predictive. The RF model performed best, giving a predictive correlation coefficient Q² of 0.814, a root mean squared error (RMSE) of 0.658 and a mean absolute error (MAE) of 0.534 for the test set. The RF model (as well as the SVM and PPR models) was visualized and explained using SHapley Additive exPlanations (SHAP) analysis to enhance its transparency and credibility. In addition, the applicability domain (AD) of the RF model was characterized by the Williams plot, and the tree manifold approximation and projection (TMAP) technique was used to illustrate the similarity and diversity of the entire data space and to assist in the analysis of outliers. Activity cliffs were investigated using Arithmetic Residuals in K-groups Analysis (ARKA) descriptors. No pesticide was identified as an activity cliff in the training set or as a potential prediction cliff in the test set. Therefore, the RF model fulfills each OECD principle for the regulatory use of QSAR models. This work will aid the in silico QSAR prediction of acute toxicity to sheepshead minnow (Cyprinodon variegatus) for untested and new toxic pesticides and can also be extended to other studies.
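The external-validation statistics quoted in the abstract (Q²_F1, CCC, RMSE, MAE) follow standard definitions, and a minimal sketch of them is given here for orientation. It is not the authors' code: the function names are hypothetical, the toy numbers are illustrative rather than taken from the study, and Q²_F1 is assumed to use the training-set mean as its reference, as in the usual definition.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def q2_f1(y_test, y_pred, y_train_mean):
    """External Q2_F1: 1 - PRESS / SS, with SS taken about the training-set mean."""
    press = sum((t - p) ** 2 for t, p in zip(y_test, y_pred))
    ss = sum((t - y_train_mean) ** 2 for t in y_test)
    return 1.0 - press / ss

def ccc(y_true, y_pred):
    """Lin's concordance correlation coefficient."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    ss_t = sum((t - mean_t) ** 2 for t in y_true)
    ss_p = sum((p - mean_p) ** 2 for p in y_pred)
    return 2.0 * cov / (ss_t + ss_p + n * (mean_t - mean_p) ** 2)

# Illustrative toy values only (not data from the study).
y_test = [1.0, 2.0, 3.0]
y_hat = [2.0, 3.0, 4.0]
train_mean = 2.0

print(rmse(y_test, y_hat), mae(y_test, y_hat))
print(q2_f1(y_test, y_hat, train_mean), ccc(y_test, y_hat))
```

Unlike a plain correlation coefficient, CCC penalizes a systematic offset between observed and predicted values, which is why it is often reported alongside Q² in external validation.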

Explainable machine learning models for predicting the acute toxicity of pesticides to sheepshead minnow ( Cyprinodon variegatus )