A comprehensive prediction system for silkworm acute toxicity assessment of environmental and in-silico pesticides
Yutong Liu,Yue Yu,Bing Wu,Jieshu Qian,Hongxin Mu,Luyao Gu,Rong Zhou,Houhu Zhang,Hua Wu,Yuanqing Bu
DOI: https://doi.org/10.1016/j.ecoenv.2024.116759
2024-09-01
Abstract: The excessive application and loss of pesticides pose a great risk to the ecosystem, and environmental safety assessment of pesticides using traditional animal toxicity tests is time-consuming and expensive. In this work, a pesticide acute toxicity dataset was created for silkworm by integrating extensive experiments and various common pesticide formulations. The silkworm was chosen for its sensitivity to adverse environments, its economic value in China, and the gap in machine learning (ML) research on toxicity prediction for this species. The dataset addresses the previous limitation that only toxicity classes, not specific toxicity values, could be predicted. A new comprehensive voting model (CVR) was developed that combines three ML regression algorithms, namely Bayesian Ridge (BR), K Neighbors Regressor (KNN), and Random Forest Regressor (RF), to accurately calculate the 50 % lethal concentration (LC50). Three conformal models were successfully constructed, marking the first combination of conformal models with confidence intervals to predict silkworm toxicity. Further, the toxicity mechanism was summarized by analyzing structural alerts, which identified 25 warning structures, 24 positive compounds and 14 negative compounds. Importantly, a novel comprehensive prediction system was constructed that provides LC50 values with confidence intervals, structural alert analysis, the lipid-water partition coefficient (LogP) and similarity analysis, so that the ecological toxicity risk of substances can be evaluated comprehensively and the incomplete toxicity data of new pesticides can be supplemented. The validity and generalization of the CVR model were verified with an external validation set. In addition, five new, low-toxicity, green pesticide alternatives were designed through 50,000 cycles.
Moreover, our software and ST Profiler provide low-cost access to information to accelerate environmental risk assessment. They can predict not only a single chemical but also batches of chemicals, simply by inputting the SMILES, CAS number, or Chinese or English name of each chemical.
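The idea of a voting regressor paired with a conformal interval can be illustrated with a minimal sketch. The abstract does not state how the three base models are weighted or which conformal variant was used, so the code below assumes an unweighted average of the base predictions and a split-conformal interval built from absolute residuals on a held-out calibration set. The function names (`vote`, `conformal_halfwidth`, `predict_interval`) and all numbers are hypothetical and are not taken from the paper.

```python
import math


def vote(predictions):
    """Average the predictions of several base regressors for one sample.

    Assumption: uniform weights, e.g. one value each from BR, KNN and RF.
    """
    return sum(predictions) / len(predictions)


def conformal_halfwidth(calib_true, calib_pred, alpha=0.1):
    """Split-conformal half-width from a held-out calibration set.

    Returns the ceil((n + 1) * (1 - alpha))-th smallest absolute residual,
    or infinity when the calibration set is too small for the requested
    coverage level 1 - alpha.
    """
    residuals = sorted(abs(t - p) for t, p in zip(calib_true, calib_pred))
    n = len(residuals)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf
    return residuals[k - 1]


def predict_interval(base_predictions, halfwidth):
    """Return (lower, point, upper) for one sample."""
    center = vote(base_predictions)
    return center - halfwidth, center, center + halfwidth


# Hypothetical calibration data: measured vs. ensemble-predicted toxicity values.
calib_true = [1.0, 2.0, 3.0, 4.0]
calib_pred = [1.5, 2.25, 3.0, 5.0]
halfwidth = conformal_halfwidth(calib_true, calib_pred, alpha=0.5)

# Hypothetical outputs of three base models for one new compound.
lower, point, upper = predict_interval([1.0, 2.0, 3.0], halfwidth)
print(lower, point, upper)
```

In practice the base predictions would come from fitted regressors and the calibration set would be much larger than four points; the sketch only shows how a point estimate and an interval fit together.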