Abstract: The diversity of hundreds of thousands of potential organic pollutants, and the lack of (publicly available) information about many of them, pose a major challenge for environmental sciences, engineering, and regulation. Suspect screening based on high-resolution liquid chromatography-mass spectrometry (LC-HRMS) has enormous potential to help characterize the presence of these chemicals in our environment, enabling the detection of known and newly emerging pollutants, as well as their potential transformation products (TPs). Here, suspect list creation (focusing on pesticides relevant for Luxembourg, incorporating data sources in four languages) was coupled to an automated retrieval of related TPs from PubChem based on high-confidence suspect hits, to screen for pesticides and their TPs in Luxembourgish river samples. A computational workflow was established to combine LC-HRMS analysis and pre-screening of the suspects (including automated quality control steps) with spectral annotation, to determine which pesticides and, in a second step, which of their related TPs may be present in the samples. The data analysis with Shinyscreen (https://gitlab.lcsb.uni.lu/eci/shinyscreen/), an open-source software developed in-house, coupled with custom-made scripts, revealed the presence of 162 potential pesticide masses and 96 potential TP masses in the samples. Further identification of these mass matches was performed using the open-source approach MetFrag (https://msbi.ipb-halle.de/MetFrag/). Subsequent target analysis of 36 suspects resulted in 31 pesticides and TPs confirmed at Level 1 (highest confidence), and five pesticides and TPs not confirmed due to different retention times. Spatio-temporal analysis of the results showed that TPs and pesticides followed similar trends, with a maximum number of potential detections in July. The highest numbers of detections were in the rivers Alzette and Mess and the lowest in the Sûre and Eisch.
This study (a) added pesticides, classification information and related TPs into the open domain; (b) developed automated open-source retrieval methods, with (a) and (b) both enhancing the FAIRness (Findability, Accessibility, Interoperability and Reusability) of the data and methods; and (c) will directly support "L'Administration de la Gestion de l'Eau" in further monitoring steps in Luxembourg.
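The suspect pre-screening step described above comes down to matching measured m/z values against the theoretical masses of the suspects. The sketch below illustrates that idea only and is not the Shinyscreen implementation. It assumes protonated ions ([M+H]+) and a 5 ppm mass tolerance, neither of which is stated in the abstract. The function names are hypothetical, and the two suspects and the feature values are illustrative inputs, not results from the study.

```python
# Minimal suspect mass-matching sketch (illustrative; not the authors' code).
# Assumptions: positive mode [M+H]+ ions and a 5 ppm mass tolerance.

PROTON_MASS = 1.007276  # Da, mass of a proton


def mz_protonated(neutral_mass):
    """Theoretical m/z of the singly charged [M+H]+ ion."""
    return neutral_mass + PROTON_MASS


def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def match_suspects(feature_mzs, suspects, tol_ppm=5.0):
    """Return (name, observed m/z, ppm error) for each feature within tolerance.

    `suspects` maps a compound name to its neutral monoisotopic mass in Da.
    """
    hits = []
    for mz in feature_mzs:
        for name, neutral_mass in suspects.items():
            err = ppm_error(mz, mz_protonated(neutral_mass))
            if abs(err) <= tol_ppm:
                hits.append((name, mz, err))
    return hits


# Illustrative suspect list: neutral monoisotopic masses in Da.
example_suspects = {
    "atrazine": 215.09377,     # C8H14ClN5
    "metolachlor": 283.13390,  # C15H22ClNO2
}

# Hypothetical measured feature m/z values.
example_features = [216.1011, 284.1412, 300.0000]

if __name__ == "__main__":
    for name, mz, err in match_suspects(example_features, example_suspects):
        print(f"{name}: m/z {mz:.4f} ({err:+.2f} ppm)")
```

A mass match of this kind is only a tentative hit. As the abstract notes, such matches still require spectral annotation (for example with MetFrag) and retention time confirmation against reference standards before a Level 1 identification can be claimed.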

Discovering pesticides and their TPs in Luxembourg waters using open cheminformatics approaches