Abstract: Determining the carcinogenicity of environmental chemicals is urgently needed as humans are increasingly exposed to these chemicals. In this study, we developed a hybrid neural network (HNN) method called HNN-Cancer to predict potential carcinogens among real-life chemicals. HNN-Cancer includes a new SMILES feature representation method, obtained by modifying our previous 3D array representation of 1D SMILES simulated by a convolutional neural network (CNN). We developed binary classification, multiclass classification, and regression models based on diverse non-congeneric chemicals. Along with the HNN-Cancer model, we developed models based on the random forest (RF), bootstrap aggregating (Bagging), and adaptive boosting (AdaBoost) methods for binary and multiclass classification. We developed regression models using HNN-Cancer, RF, support vector regressor (SVR), gradient boosting (GB), kernel ridge (KR), decision tree with AdaBoost (DT), KNeighbors (KN), and a consensus method. The performance of all models was assessed using various statistical metrics. For binary classification models developed with 7994 chemicals, the accuracy of the HNN-Cancer, RF, and Bagging models was 74%, and their AUC was ~0.81. The sensitivity was 79.5% and the specificity was 67.3% for HNN-Cancer, which outperformed the other methods. For multiclass classification models developed with 1618 chemicals, the best accuracy was 70%, with an AUC of 0.7, for HNN-Cancer, RF, Bagging, and AdaBoost. For regression models, the correlation coefficient (R) was around 0.62 for HNN-Cancer and RF, higher than for the SVR, GB, KR, DT, and KN machine learning methods. Overall, HNN-Cancer performed better for the majority of the known carcinogen experimental datasets. Further, the predictive performance of HNN-Cancer on diverse chemicals is comparable to that of literature-reported models that included similar and less diverse molecules. HNN-Cancer could be used to identify potentially carcinogenic chemicals across a wide variety of chemical classes.
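The binary classification results above are reported as accuracy, sensitivity, and specificity. As a minimal sketch of how these three metrics follow from a confusion matrix, the Python below computes them from 0/1 labels. The function name `binary_metrics` and the toy labels are assumptions made for illustration; they are not the study's code or data.

```python
def binary_metrics(y_true, y_pred):
    """Return accuracy, sensitivity, and specificity for 0/1 labels.

    Assumes 1 marks a carcinogen and 0 a non-carcinogen (illustrative convention).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Sensitivity: fraction of actual positives that were predicted positive.
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        # Specificity: fraction of actual negatives that were predicted negative.
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
    }


# Toy labels, made up for illustration only (not data from the study).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

A model with high sensitivity and lower specificity, as reported for HNN-Cancer, misses few carcinogens at the cost of flagging more non-carcinogens.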

Predicting Chemical Carcinogens Using a Hybrid Neural Network Deep Learning Method
