Developing novel computational prediction models for assessing chemical-induced neurotoxicity using naïve Bayes classifier technique

Hui Zhang, Jun Mao, Hua-Zhao Qi, Huan-Zhang Xie, Chen Shen, Chun-Tao Liu, Lan Ding
DOI: https://doi.org/10.1016/j.fct.2020.111513
IF: 4.3
2020-09-01
Food and Chemical Toxicology
Abstract: Development of reliable and efficient alternatives to *in vivo* methods for evaluating chemicals with potential neurotoxicity is an urgent need in the early stages of drug design. In this investigation, computational prediction models for drug-induced neurotoxicity were developed using the classical naïve Bayes classifier. Eight molecular properties closely relevant to neurotoxicity were selected. Then, 110 classification models were developed using the eight molecular descriptors and 10 types of fingerprints with 11 different maximum diameters. Among these 110 prediction models, the model (NB-03) based on the eight molecular descriptors combined with ECFP_10 fingerprints showed the best performance, giving 90.5% overall prediction accuracy for the training set and 82.1% concordance for the external test set. In addition, the recursive partitioning classifier displayed worse predictive performance for neurotoxicity than the naïve Bayes classifier. Therefore, the established NB-03 prediction model can be used as a reliable virtual screening tool to predict neurotoxicity in the early stages of drug design. Moreover, some structural alerts for characterizing neurotoxicity were identified, which could provide important guidance for chemists in structural modification and optimization to reduce potential neurotoxicity.
toxicology, food science & technology
What problem does this paper attempt to address?
This paper sets out to develop reliable and efficient computational prediction models for assessing the neurotoxicity caused by chemical substances. Specifically, the research aims to:

1. **Establish reliable computational prediction models**: Develop models that predict whether a chemical substance is neurotoxic, using the classic naïve Bayes classifier.
2. **Improve efficiency in the early drug-design stage**: Identify potentially neurotoxic chemical substances quickly and accurately in the early stage of drug design, thereby reducing costs and the use of experimental animals.
3. **Provide guidance for structural optimization**: Identify structural alerts related to neurotoxicity and give chemists theoretical guidance on structural modification and optimization to reduce the potential neurotoxicity of compounds.

### Main methods and techniques

- **Dataset selection and pre-processing**: A dataset of 2,171 compounds was collected, comprising 1,575 neurotoxic compounds and 596 non-neurotoxic compounds.
- **Selection of molecular descriptors**: Eight molecular descriptors closely related to neurotoxicity were selected using Cramér's V coefficient and the t-test: the number of carbon atoms (C_Count), ALogP, logD, molecular weight (Molecular_Weight), the number of hydrogen-bond acceptors (Num_H_Acceptors), the number of rings (Num_Rings), polar surface area (Molecular_PolarSurfaceArea) and molecular surface area (Molecular_SASA).
- **Molecular fingerprints**: Ten different types of molecular fingerprints (such as ECFP_10) were combined with the molecular descriptors to construct prediction models.
- **Machine-learning methods**: Two classic machine-learning methods, the naïve Bayes classifier and the recursive partitioning classifier, were used for model training and comparison.
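
The naïve Bayes approach described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes every compound has already been reduced to a set of binary features (fingerprint substructures plus binned descriptor ranges), it uses plain additive smoothing, and the feature names, toy data, and the functions `train_bernoulli_nb` and `predict` are all hypothetical.

```python
import math
from collections import defaultdict


def train_bernoulli_nb(samples, labels, alpha=1.0):
    """Fit a Bernoulli naive Bayes model.

    samples: list of sets; each set holds the feature IDs present in a compound
    labels:  list of class labels (here 1 = neurotoxic, 0 = non-neurotoxic)
    alpha:   additive smoothing constant (an assumption of this sketch)
    """
    vocab = set().union(*samples)
    class_counts = defaultdict(int)
    feature_counts = defaultdict(lambda: defaultdict(int))
    for feats, y in zip(samples, labels):
        class_counts[y] += 1
        for f in feats:
            feature_counts[y][f] += 1

    n = len(labels)
    model = {"vocab": vocab, "log_prior": {}, "log_on": {}, "log_off": {}}
    for y, n_y in class_counts.items():
        model["log_prior"][y] = math.log(n_y / n)
        model["log_on"][y] = {}
        model["log_off"][y] = {}
        for f in vocab:
            # Smoothed estimate of P(feature present | class y)
            p = (feature_counts[y][f] + alpha) / (n_y + 2 * alpha)
            model["log_on"][y][f] = math.log(p)
            model["log_off"][y][f] = math.log(1.0 - p)
    return model


def predict(model, feats):
    """Return the class with the highest posterior log-score."""
    scores = {}
    for y, log_prior in model["log_prior"].items():
        score = log_prior
        for f in model["vocab"]:
            if f in feats:
                score += model["log_on"][y][f]
            else:
                score += model["log_off"][y][f]
        scores[y] = score
    return max(scores, key=scores.get)


# Hypothetical toy data: "fp:*" stands for a fingerprint substructure,
# "logP:high" for a binned descriptor range.
train_samples = [
    {"fp:A", "logP:high"},
    {"fp:A", "fp:B"},
    {"fp:A", "logP:high", "fp:C"},
    {"fp:C"},
    {"fp:B", "fp:C"},
]
train_labels = [1, 1, 1, 0, 0]

model = train_bernoulli_nb(train_samples, train_labels)
print(predict(model, {"fp:A", "logP:high"}))
print(predict(model, {"fp:C"}))
```

Scores are summed in log space so that multiplying many small probabilities does not underflow, which matters once a fingerprint contributes thousands of features.
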
### Key results

- **Best prediction model NB-03**: The naïve Bayes classifier model (NB-03) based on the eight molecular descriptors and ECFP_10 fingerprints performed best, with an overall prediction accuracy of 90.5% on the training set and a concordance of 82.1% on the external test set.
- **Identification of structural alerts**: The study also identified structural features related to neurotoxicity, which can guide the optimization of chemical structures.

Through these efforts, the study provides a reliable virtual screening tool for the early stage of drug development, helping to assess the neurotoxicity risk of chemical substances more efficiently.
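
For readers who want to reproduce figures of this kind, the sketch below shows how overall accuracy can be computed from predicted and observed labels, assuming the usual definition in which accuracy (also called concordance) is the fraction of compounds classified correctly. Sensitivity and specificity are included only as commonly reported companions; the source does not give them, and the function name `classification_metrics` and the toy labels are hypothetical.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, sensitivity and specificity for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    total = tp + tn + fp + fn
    return {
        # Fraction of all compounds classified correctly
        "accuracy": (tp + tn) / total if total else float("nan"),
        # Fraction of true positives recovered
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Fraction of true negatives recovered
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Hypothetical toy labels (1 = neurotoxic, 0 = non-neurotoxic)
observed = [1, 1, 1, 0, 0]
predicted = [1, 1, 0, 0, 1]
metrics = classification_metrics(observed, predicted)
print(metrics)
```

With an imbalanced dataset such as the one summarized here (1,575 neurotoxic versus 596 non-neurotoxic compounds), accuracy alone can look good for a model that favors the majority class, which is why per-class rates are worth checking alongside it.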