Interactive Knowledge-based Kernel PCA for Solvent Selection

Jonathan Hirst, Samuel Boobier, Joseph Heeley, Thomas Gaertner
DOI: https://doi.org/10.26434/chemrxiv-2024-nv0r4
2024-09-24
Abstract: Selecting more sustainable solvents is a crucial component of mitigating the environmental impacts of chemical processes. Numerous tools have been developed to address this problem within the pharmaceutical industry, employing data-driven approaches such as multidimensional scaling or principal component analysis (PCA). Interactive knowledge-based kernel PCA is a variant of PCA that allows users to shape 2D solvent maps by defining the positions of datapoints, imparting expert knowledge that was not included in the original descriptor set. We have applied interactive PCA to the task of solvent selection and present an intuitive interface that is integrated into AI4Green, an electronic laboratory notebook that encourages sustainable chemistry. A set of evidence-based user guidelines was developed and used in combination with the interactive PCA to identify four potential solvent substitutions for an example thioesterification reaction.
Chemistry
What problem does this paper attempt to address?
The paper addresses the problem of choosing more sustainable solvents for chemical processes. The authors apply interactive knowledge-based kernel PCA to solvent selection: users drag data points on a 2D solvent map, for example to reflect experimental data, and the map is reshaped to incorporate knowledge that the original descriptor set did not capture. This makes it easier to spot sustainable alternatives that lie close to a solvent already in use. The method is integrated into AI4Green, an electronic laboratory notebook that encourages sustainable chemistry. The study also presents a set of evidence-based user guidelines for using the tool effectively. In a case study on an example thioesterification reaction, the guidelines and the interactive PCA together identified four potential solvent substitutions.
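To make the idea of a 2D solvent map concrete, the following is a minimal sketch of ordinary (non-interactive) kernel PCA with an RBF kernel, followed by a nearest-neighbour lookup on the resulting map. It does not implement the paper's interactive knowledge-based variant, in which user-placed points reshape the embedding. The function names, the `gamma` value, the solvent labels, and the descriptor values are all hypothetical placeholders, not taken from the paper or from AI4Green.

```python
import numpy as np


def rbf_kernel(X, gamma):
    """Pairwise RBF kernel matrix: exp(-gamma * squared Euclidean distance)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))


def kernel_pca_2d(X, gamma=0.5):
    """Project rows of X onto the top two kernel principal components."""
    n = X.shape[0]
    K = rbf_kernel(X, gamma)
    H = np.eye(n) - np.ones((n, n)) / n   # centring matrix
    Kc = H @ K @ H                        # kernel centred in feature space
    vals, vecs = np.linalg.eigh(Kc)       # eigenvalues in ascending order
    top = np.argsort(vals)[::-1][:2]      # indices of the two largest
    # Training-point coordinates: eigenvector scaled by sqrt(eigenvalue).
    return vecs[:, top] * np.sqrt(np.maximum(vals[top], 0.0))


def nearest_on_map(coords, names, query, k=2):
    """Return the k solvents closest to `query` on the 2D map."""
    i = names.index(query)
    dist = np.linalg.norm(coords - coords[i], axis=1)
    order = [j for j in np.argsort(dist) if j != i][:k]
    return [names[j] for j in order]


# Assumption: invented descriptor values for five placeholder solvents.
names = ["solvent_A", "solvent_B", "solvent_C", "solvent_D", "solvent_E"]
X = np.array([
    [0.10, 0.20, 0.10],
    [0.15, 0.25, 0.05],
    [0.90, 0.80, 0.70],
    [0.85, 0.90, 0.75],
    [0.50, 0.50, 0.40],
])

coords = kernel_pca_2d(X)
neighbours = nearest_on_map(coords, names, "solvent_A", k=1)
print(neighbours)
```

In a map like this, solvents that sit close together have similar descriptors, so the neighbours of a solvent in current use are natural candidates for substitution. The interactive method described above goes further by letting the user move points, which this sketch does not attempt.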