ToxMPNN: A Deep Learning Model for Small Molecule Toxicity Prediction.

Yini Zhou, Chao Ning, Yijun Tan, Yaqi Li, Jiaxu Wang, Yuanyuan Shu, Songping Liang, Zhonghua Liu, Ying Wang
DOI: https://doi.org/10.1002/jat.4591
IF: 3.3
2024-01-01
Journal of Applied Toxicology
Abstract: Machine learning (ML) has shown great promise in predicting the toxicity of small molecules. However, the data available for such predictions are often limited. Because models trained on a single toxicity endpoint perform unsatisfactorily, we collected toxic small molecules with multiple toxicity endpoints from a previous study. The dataset comprises 27 toxicity endpoints grouped into seven toxicity classes: carcinogenicity and mutagenicity, acute oral toxicity, respiratory toxicity, irritation and corrosion, cardiotoxicity, CYP450, and endocrine disruption. In addition, a binary classification Common-Toxicity task was derived from this dataset. To improve model performance, we added marketed drugs as negative samples. This study presents ToxMPNN, a toxicity prediction model for small molecules based on the message passing neural network (MPNN) architecture. The results demonstrate that ToxMPNN outperforms the other models in capturing toxic features within the molecular structure, yielding more precise predictions, with a test ROC_AUC of 0.886 on the Toxicity_drug dataset. Furthermore, adding marketed drugs as negative samples not only improves predictive performance on the binary classification Common-Toxicity task but also makes the model's predictions more stable. These results suggest that the graph-based deep learning (DL) algorithms in this study can serve as a trustworthy and effective tool for assessing small molecule toxicity during the development of new drugs.
ToxMPNN achieves a test ROC_AUC of 0.886 on the Toxicity_drug dataset, which contains 27 toxicity endpoints in seven toxicity categories plus a binary classification Common-Toxicity task, and can be used as an effective tool for predicting the toxicity of small molecules.
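The abstract names the message passing neural network (MPNN) architecture without describing it. The following is a minimal, dependency-free sketch of the general idea only, not ToxMPNN's implementation: each atom sums its neighbours' feature vectors (the message), adds the result to its own state (the update), and a readout sums all atom states into one molecule-level vector. The function names, the toy three-atom graph, and the two-dimensional features are all hypothetical, and a real MPNN would use learned weights, bond features, and non-linearities in both steps.

```python
def message_passing_step(features, bonds):
    """One round of sum-aggregation message passing on a molecular graph.

    features: dict mapping atom index -> list of floats (hypothetical atom features)
    bonds: list of (i, j) pairs, each an undirected bond
    """
    # Build an adjacency list from the undirected bond list.
    neighbors = {atom: [] for atom in features}
    for i, j in bonds:
        neighbors[i].append(j)
        neighbors[j].append(i)

    updated = {}
    for atom, h in features.items():
        # Message: element-wise sum of the neighbours' feature vectors.
        message = [0.0] * len(h)
        for nb in neighbors[atom]:
            message = [m + x for m, x in zip(message, features[nb])]
        # Update: add the message to the atom's own state (no learned weights here).
        updated[atom] = [a + m for a, m in zip(h, message)]
    return updated


def readout(features):
    """Sum all atom vectors into a single molecule-level vector."""
    dim = len(next(iter(features.values())))
    total = [0.0] * dim
    for h in features.values():
        total = [t + x for t, x in zip(total, h)]
    return total


# Toy three-atom chain (0-1-2) with hypothetical 2-dimensional atom features.
atoms = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 0.0]}
bonds = [(0, 1), (1, 2)]
step1 = message_passing_step(atoms, bonds)
molecule_vector = readout(step1)
```

In a full model, the molecule-level vector would feed a classifier head that outputs one probability per toxicity endpoint. Repeating the step lets information travel further along the bond graph, one bond per round.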