Abstract: The majority of chemicals detected via nontarget liquid chromatography high-resolution mass spectrometry (HRMS) in environmental samples remain unidentified, challenging the capability of existing machine learning models to pinpoint potential endocrine disruptors (EDs). Here, we predict the activity of unidentified chemicals across 12 bioassays related to EDs within the Tox21 10K dataset. Single- and multi-output models, utilizing various machine learning algorithms and molecular fingerprint features as input, were trained for this purpose. To evaluate the models under near real-world conditions, Monte Carlo sampling was implemented for the first time. This technique enables the use of probabilistic fingerprint features, derived from experimental HRMS data with SIRIUS+CSI:FingerID, as input for models trained on true binary fingerprint features. Depending on the bioassay, the lowest false-positive rate at 90% recall ranged from 0.251 (sr.mmp, mitochondrial membrane potential) to 0.824 (nr.ar, androgen receptor), which is consistent with the trends observed in the performance of models submitted for the Tox21 Data Challenge. These findings underscore the informativeness of fingerprint features that can be compiled from HRMS data in predicting endocrine-disrupting activity. Moreover, an in-depth SHapley Additive exPlanations analysis unveiled the models' ability to pinpoint structural patterns linked to the modes of action of active chemicals. Despite the superior performance of the single-output models compared to that of the multi-output models, the latter's potential cannot be disregarded for similar tasks in the field of in silico toxicology. This study presents a significant advancement in identifying potentially toxic chemicals within complex mixtures without unambiguous identification, effectively reducing the postprocessing workload in nontarget HRMS by up to 75%.
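The Monte Carlo step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that each fingerprint bit is sampled independently as a Bernoulli trial using its predicted probability, and that the per-sample model outputs are averaged. The function name `monte_carlo_predict`, the stand-in `toy_model`, and the example probabilities are hypothetical.

```python
import numpy as np


def monte_carlo_predict(predict_proba, bit_probs, n_samples=100, seed=0):
    """Average a binary-fingerprint model's score over sampled fingerprints.

    predict_proba: callable taking an (n, n_bits) 0/1 array and returning
                   n activity scores (hypothetical interface).
    bit_probs:     per-bit probabilities, e.g. as predicted from HRMS data.
    """
    rng = np.random.default_rng(seed)
    bit_probs = np.asarray(bit_probs, dtype=float)
    # Assumption: bits are independent, so each one is a Bernoulli draw
    # with its own predicted probability.
    uniform = rng.random((n_samples, bit_probs.size))
    samples = (uniform < bit_probs).astype(np.uint8)
    scores = np.asarray(predict_proba(samples), dtype=float)
    # Assumption: the mean over samples is used as the final score;
    # the spread indicates how sensitive the score is to uncertain bits.
    return scores.mean(), scores.std()


def toy_model(fingerprints):
    """Hypothetical stand-in for a trained model: active if bits 0 and 1 are set."""
    return fingerprints[:, :2].all(axis=1).astype(float)


probs = [0.9, 0.8, 0.1, 0.5]  # illustrative probabilistic fingerprint
mean_score, spread = monte_carlo_predict(toy_model, probs, n_samples=5000)
print(f"mean score {mean_score:.3f}, spread {spread:.3f}")
```

In this sketch the model only ever sees binary fingerprints, matching the form it was trained on, while the uncertainty in the predicted bits is carried into the averaged score.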

In Silico Prediction of Endocrine Disrupting Chemicals Using Single-Label and Multilabel Models

Computational Models to Predict Endocrine-Disrupting Chemical Binding with Androgen or Oestrogen Receptors

Modeling and Insights into the Structural Characteristics of Endocrine-Disrupting Chemicals

QSAR Study of Endocrine Disrupting Chemicals

Prediction of Endocrine-Disrupting Chemicals Related to Estrogen, Androgen, and Thyroid Hormone (EAT) Modalities Using Transcriptomics Data and Machine Learning

Mechanistic in silico modeling of bisphenols to predict estrogen and glucocorticoid disrupting potentials

A Ternary Classification Using Machine Learning Methods of Distinct Estrogen Receptor Activities Within A Large Collection of Environmental Chemicals

Predicting the Androgenicity of Structurally Diverse Compounds from Molecular Structure Using Different Classifiers

Prediction of estrogen receptor binding for 58,000 chemicals using an integrated system of a tree-based model with structural alerts

In Silico Prediction of Chemicals Binding to Aromatase with Machine Learning Methods

Interpretable machine learning for the identification of estrogen receptor agonists, antagonists, and binders

EDC-Predictor: A Novel Strategy for Prediction of Endocrine-Disrupting Chemicals by Integrating Pharmacological and Toxicological Profiles

ED Profiler: Machine Learning Tool for Screening Potential Endocrine-Disrupting Chemicals

Predicting the Activity of Unidentified Chemicals in Complementary Bioassays from the HRMS Data to Pinpoint Potential Endocrine Disruptors

Ternary classification models for predicting hormonal activities of chemicals via nuclear receptors

Computational Studies of Interactions Between Endocrine Disrupting Chemicals and Androgen Receptor of Different Vertebrate Species

Development of Predictive Models for Predicting Binding Affinity of Endocrine Disrupting Chemicals to Fish Sex Hormone-Binding Globulin

Activation of steroid hormone receptors: Shed light on the in silico evaluation of endocrine disrupting chemicals

Machine Learning for Investigation on Endocrine-Disrupting Chemicals with Gestational Age and Delivery Time in a Longitudinal Cohort

Assessing Estrogenic Activities of Selected Endocrine Disrupting Chemicals and Their Combination Effects

In Silico Prediction of Chemical Reproductive Toxicity Using Machine Learning