Abstract: This study addresses the challenge of assessing the carcinogenic potential of hazardous chemical mixtures, such as per- and polyfluorinated substances (PFASs), which are known to contribute significantly to cancer development. We propose HNNMixCancer, a novel framework that integrates a hybrid neural network (HNN) into a machine-learning pipeline. The framework incorporates a mathematical model to simulate chemical mixtures, enabling models for binary classification (carcinogenic or noncarcinogenic), multiclass classification (categorical carcinogenicity), and regression (carcinogenic potency). Through extensive experimentation, we demonstrate that the HNN model outperforms other methods, including random forest (RF), bootstrap aggregating (Bagging), adaptive boosting (AdaBoost), support vector regressor (SVR), gradient boosting (GB), kernel ridge (KR), decision tree (DT) with AdaBoost, and KNeighbors (KN), achieving an accuracy of 92.7% in binary classification. To address the limited availability of experimental data and enrich the training data, we generate an assumption-based virtual library of chemical mixtures from known carcinogenic and noncarcinogenic single chemicals for all the classification models. In this setting, all methods achieve accuracies exceeding 98% for binary classification. In external validation tests, the HNN achieves the highest accuracy, 80.5%. In multiclass classification, the HNN reaches an overall accuracy of 96.3%, outperforming RF, Bagging, and AdaBoost, which achieve 91.4%, 91.7%, and 80.2%, respectively. In the regression models, HNN, RF, SVR, GB, KR, DT with AdaBoost, and KN achieve average R² values of 0.96, 0.90, 0.77, 0.94, 0.96, 0.96, and 0.97, respectively, showing that they can predict the concentration at which a chemical mixture becomes carcinogenic.
Our method shows strong predictive power in prioritizing carcinogenic chemical mixtures, even when relying on assumption-based mixtures. This capability is particularly valuable for toxicology studies that lack experimental data on the carcinogenicity and toxicity of chemical mixtures. To our knowledge, this study introduces the first method for predicting the carcinogenic potential of chemical mixtures. The HNNMixCancer framework offers a novel alternative for dose-dependent carcinogen prediction. Ongoing work applies the HNN method to predicting mixture toxicity and extends HNNMixCancer to multiple mixtures, such as PFAS mixtures and co-occurring chemicals.
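
As an illustration only, an assumption-based virtual library of binary mixtures could be sketched as below. The abstract does not specify the mixing model or the labeling rule, so everything here is an assumption made for the sketch: the concentration-weighted descriptor average, the rule that a mixture is labeled carcinogenic if either component is, the placeholder descriptor values, and all names (`CHEMICALS`, `mix_descriptors`, `build_virtual_library`). It is not the published HNNMixCancer procedure.

```python
from itertools import combinations

# Hypothetical single-chemical records: (name, descriptor vector, label).
# Label 1 = carcinogenic, 0 = noncarcinogenic. Values are placeholders, not real data.
CHEMICALS = [
    ("chem_A", [1.0, 0.0, 0.5], 1),
    ("chem_B", [0.0, 1.0, 0.5], 0),
    ("chem_C", [0.2, 0.4, 0.0], 0),
]


def mix_descriptors(desc_a, desc_b, frac_a):
    """Concentration-weighted average of two descriptor vectors (assumed mixing rule)."""
    frac_b = 1.0 - frac_a
    return [frac_a * a + frac_b * b for a, b in zip(desc_a, desc_b)]


def build_virtual_library(chemicals, fractions=(0.25, 0.5, 0.75)):
    """Enumerate all binary mixtures at a few mixing fractions.

    Assumed labeling rule: a mixture is carcinogenic (1) if either component is.
    """
    library = []
    for (name_a, desc_a, y_a), (name_b, desc_b, y_b) in combinations(chemicals, 2):
        for frac_a in fractions:
            library.append({
                "components": (name_a, name_b),
                "fraction_a": frac_a,
                "descriptor": mix_descriptors(desc_a, desc_b, frac_a),
                "label": max(y_a, y_b),
            })
    return library


library = build_virtual_library(CHEMICALS)
```

Each entry pairs a mixed descriptor vector with a label, which is the shape of input a binary classifier would need. A real pipeline would replace the placeholder vectors with computed molecular descriptors or fingerprints.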

Prediction of Chemical Carcinogenicity by Machine Learning Approaches

Predicting the Androgenicity of Structurally Diverse Compounds from Molecular Structure Using Different Classifiers

Predicting Chemical Carcinogens Using a Hybrid Neural Network Deep Learning Method

QSAR study on the toxicity of chemical components of Chinese materia medica causing the carcinogenicity of rats

Unlocking the potential of AI: Machine learning and deep learning models for predicting carcinogenicity of chemicals

In Silico Prediction of Chemical Genotoxicity Using Machine Learning Methods and Structural Alerts

In Silico Estimation of Chemical Carcinogenicity with Binary and Ternary Classification Methods

CarcinoPred-EL: Novel models for predicting the carcinogenicity of chemicals using molecular fingerprints and ensemble learning methods

In Silico Prediction of Chemical Ames Mutagenicity

In Silico Prediction of Chemical Reproductive Toxicity Using Machine Learning

Predicting Dose-Dependent Carcinogenicity of Chemical Mixtures Using a Novel Hybrid Neural Network Framework and Mathematical Approach

ADMET Evaluation in Drug Discovery. 18. Reliable Prediction of Chemical-Induced Urinary Tract Toxicity by Boosting Machine Learning Approaches

Prediction of genotoxicity of chemical compounds by statistical learning methods

MLASM: Machine learning based prediction of anticancer small molecules

Use of support vector machine to predict the toxicity of aromatic compounds

Classification of the Carcinogenicity of N-nitroso Compounds Based on Support Vector Machines and Linear Discriminant Analysis

An Explainable Supervised Machine Learning Model for Predicting Respiratory Toxicity of Chemicals Using Optimal Molecular Descriptors

Prediction of the acute toxicity of chemical compounds to the fathead minnow by machine learning approaches

SVM Model for Predicting Carcinogenicity of Polycyclic Aromatic Hydrocarbons and Derivatives

Predicting Organ Toxicity Using in Vitro Bioactivity Data and Chemical Structure

Application of support vector machine (SVM) for prediction toxic activity of different data sets