Abstract: In silico methods are essential to the safety evaluation of chemicals. Computational risk assessment offers several approaches, with data science and knowledge-based methods becoming an increasingly important subgroup. A key strength of data science is that it uses existing data to find correlations, build strong hypotheses, and create new, valuable knowledge that may help reduce the number of resource-intensive experiments. In choosing a suitable method for toxicity prediction, the available data and the desired toxicity endpoint are two essential factors to consider. The complexity of the endpoint can affect the success rate of in silico models. For highly complex endpoints such as hepatotoxicity, it can be beneficial to decipher the toxic event from a more systemic point of view. We propose a data science-based modelling pipeline that uses compounds' connections to tissue-specific biological targets, the interactome, and biological pathways as compound descriptors. Models trained on different combinations of the collected compound-target, compound-interactor, and compound-pathway profiles were used to predict the hepatotoxicity of drug-like compounds. Several tree-based models were trained using separate and combined target-, interactome-, and pathway-level variables. The model using combined descriptors of all levels and the random forest algorithm was further optimized. Descriptor importance for model performance was assessed and examined for a biological explanation to identify which targets or pathways may play a crucial role in toxicity. Descriptors connected to cytochrome P450 enzymes, heme degradation, and biological oxidation received high weights. Furthermore, other, less discussed processes, such as the involvement of RHO GTPase effectors in hepatotoxicity, were marked as fundamental.
The optimized combined model using only the selected descriptors yielded the best performance, with an accuracy of 0.766. On the same dataset, models using classical Morgan fingerprints for compound representation, as well as models combining systems biology-based descriptors with Morgan fingerprints, yielded similar performance measures. Consequently, adding the structural information of compounds did not enhance the predictive value of the models. The developed systems biology-based pipeline is a valuable tool for predicting toxicity while providing novel insights into the possible mechanisms of the unwanted events.
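
To make the idea of combined target, interactome, and pathway descriptors concrete, the following minimal sketch builds a binary compound-by-descriptor matrix from three annotation levels. It is an illustration only, not the authors' implementation: the function name `build_profiles`, the compound identifiers, and all annotation names (`TARGET_1`, `PATHWAY_1`, and so on) are hypothetical placeholders, and the abstract does not state how the real profiles were encoded.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical placeholder annotations (not real data): each level maps a
# compound identifier to the set of entities it is connected to.
LEVELS: Dict[str, Dict[str, Set[str]]] = {
    "target": {
        "cmpd_A": {"TARGET_1", "TARGET_2"},
        "cmpd_B": {"TARGET_3"},
    },
    "interactor": {
        "cmpd_A": {"INTERACTOR_1"},
        "cmpd_B": {"INTERACTOR_1", "INTERACTOR_2"},
    },
    "pathway": {
        "cmpd_A": {"PATHWAY_1"},
        "cmpd_B": {"PATHWAY_2"},
    },
}


def build_profiles(
    levels: Dict[str, Dict[str, Set[str]]],
) -> Tuple[List[str], List[Tuple[str, str]], List[List[int]]]:
    """Return (compounds, columns, matrix) for the combined descriptor set.

    Each column is a (level, entity) pair; a cell is 1 when the compound is
    annotated with that entity at that level, otherwise 0.
    """
    # Every compound that appears at any level, in a stable order.
    compounds = sorted({c for ann in levels.values() for c in ann})
    # One column per distinct (level, entity) pair, in a stable order.
    columns = sorted(
        {(level, name)
         for level, ann in levels.items()
         for names in ann.values()
         for name in names}
    )
    matrix = [
        [1 if name in levels[level].get(c, set()) else 0
         for level, name in columns]
        for c in compounds
    ]
    return compounds, columns, matrix


compounds, columns, matrix = build_profiles(LEVELS)
for compound, row in zip(compounds, matrix):
    print(compound, row)
```

Passing only one level (for example `{"target": LEVELS["target"]}`) yields the corresponding single-level profile, so the same helper covers both the separate and the combined descriptor sets. A matrix like this could then be passed to any tree-based classifier.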

Combining human cell line transcriptome analysis and Bayesian inference to build trustworthy machine learning models for prediction of animal toxicity in drug development

Identifying Protein Features and Pathways Responsible for Toxicity Using Machine Learning and Tox21: Implications for Predictive Toxicology

DIVERSE: Bayesian Data IntegratiVE learning for precise drug ResponSE prediction

Drug Toxicity Prediction by Machine Learning Approaches

In silico prediction of drug-induced developmental toxicity by using machine learning approaches

Personalised Medicine: Establishing predictive machine learning models for drug responses in patient derived cell culture

Predictive Systems Toxicology

A deep learning based multi-model approach for predicting drug-like chemical compound's toxicity

A Review on the Recent Applications of Deep Learning in Predictive Drug Toxicological Studies.

Predicting non-chemotherapy drug-induced agranulocytosis toxicity through ensemble machine learning approaches

Accurate Clinical Toxicity Prediction using Multi-task Deep Neural Nets and Contrastive Molecular Explanations

AI-driven Discovery of Morphomolecular Signatures in Toxicology

Identification of Optimal Machine Learning Algorithms and Molecular Fingerprints for Explainable Toxicity Prediction Models Using ToxCast/Tox21 Bioassay Data

MolToxPred: small molecule toxicity prediction using machine learning approach

Review of machine learning and deep learning models for toxicity prediction

Machine Learning Prediction of On/Off Target-driven Clinical Adverse Events

Toxicity prediction using target, interactome, and pathway profiles as descriptors

Model ensembling as a tool to form interpretable multi-omic predictors of cancer pharmacosensitivity

How to Predict Effective Drug Combinations - Moving beyond Synergy Scores

Systematic Assessment of Analytical Methods for Drug Sensitivity Prediction from Cancer Cell Line Data

Deep Learning-based Modeling for Preclinical Drug Safety Assessment