Abstract: The incorporation of data science is revolutionizing organic chemistry. It is becoming increasingly possible to predict reaction outcomes with accuracy, computationally plan new retrosynthetic routes to complex molecules, and design molecules with sophisticated functions. Critical to these developments has been statistical analysis of reaction data, for instance with machine learning, yet very few reaction data are available upon which to build models. Reaction data can be mined from the literature, but experimental data tend to be reported in a text format that is difficult for computers to read. Compounding the issue, literature data are heavily biased toward "productive" reactions, and few "negative" reaction data points are reported even though they are critical for training of statistical models. High-throughput experimentation (HTE) has evolved over the past few decades as a tool for experimental reaction development. The beauty of HTE is that reactions are run in a systematic format, so data points are internally consistent, the reaction data are reported whether the desired product is observed or not, and automation may reduce the occurrence of false positive or negative data points. Additionally, experimental workflows for HTE lead to datasets with reaction metadata that are captured in a machine-readable format. We believe that HTE will play an increasingly important role in the data revolution of chemical synthesis. This Account details the miniaturization of synthetic chemistry culminating in ultrahigh-throughput experimentation (ultraHTE), wherein reactions are run in ∼1 μL droplets inside of 1536-well microtiter plates to minimize the use of starting materials while maximizing the output of experimental information.
The performance of ultraHTE in 1536-well microtiter plates has led to an explosion of available reaction data, which have been used to identify specific substrate–catalyst pairs for maximal efficiency in novel cross-coupling reactions. The first iteration of ultraHTE focused on the use of dimethyl sulfoxide (DMSO) as a high-boiling solvent that is compatible with the plastics most commonly used in consumable well plates, which generated homogeneous reaction mixtures that are perfect for use with nanoliter-dosing liquid handling robotics. In this way, DMSO enabled diverse reagents to be arrayed in ∼1 μL droplets. Reactions were run at room temperature with no agitation and could be scaled up from the ∼0.05 mg reaction scale to the 1 g scale. Engineering enhancements enabled the use of ultraHTE with diverse and semivolatile solvents, photoredox catalysis, heating, and acoustic agitation. A main driver in the development of ultraHTE was the recognition of the opportunity for a direct merger between miniaturized reactions and biochemical assays. Indeed, a strategy was developed to feed ultraHTE reaction mixtures directly to a mass-spectrometry-based affinity selection bioassay. Thus, micrograms of starting materials could be used in the synthesis and direct biochemical testing of drug-like molecules. Reactions were performed at a reactant concentration of ∼0.1 M in an inert atmosphere, enabling even challenging transition-metal-catalyzed reactions to be used. Software to enable the workflow was developed. We recently initiated the mapping of reaction space, dreaming of a future where transformations, reaction conditions, structure, properties, and function are studied in a systems chemistry approach.
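The quantities quoted above (∼1 μL droplets, ∼0.1 M reactant, ∼0.05 mg reaction scale, 1536 wells) can be related by simple arithmetic. The following is a minimal sketch of that check, not a calculation taken from the Account: the molar mass of 500 g/mol is an assumed, illustrative value for a drug-like substrate, and all names are hypothetical.

```python
# Back-of-the-envelope scale check for a single ultraHTE well.
# Volume, concentration, and plate size come from the abstract;
# the molar mass is an ASSUMED illustrative value, not from the source.

WELLS_PER_PLATE = 1536        # plate format named in the text
DROPLET_VOLUME_L = 1e-6       # ~1 uL per reaction
CONCENTRATION_M = 0.1         # ~0.1 M reactant
ASSUMED_MOLAR_MASS = 500.0    # g/mol, hypothetical drug-like substrate


def reactant_mass_mg(volume_l, conc_m, molar_mass):
    """Mass of reactant (mg) contained in one droplet."""
    moles = volume_l * conc_m          # mol = L * mol/L
    return moles * molar_mass * 1000.0  # g -> mg


per_well_mg = reactant_mass_mg(DROPLET_VOLUME_L, CONCENTRATION_M, ASSUMED_MOLAR_MASS)

# Material needed if the same reactant were dosed into every well of one plate.
per_plate_mg = per_well_mg * WELLS_PER_PLATE

# Factor separating the droplet scale from a 1 g (1000 mg) reaction.
scale_up_factor = 1000.0 / per_well_mg

print(f"{per_well_mg:.3f} mg per well, {per_plate_mg:.1f} mg per plate, "
      f"{scale_up_factor:.0f}x to reach 1 g")
```

Under the assumed molar mass, one droplet holds about 0.05 mg of reactant, which matches the reaction scale quoted in the text; a full plate dosed with that reactant would consume roughly 77 mg, and the 1 g scale is about 20,000 times larger than a single well.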
