A Machine Learning-Guided Approach to Navigate the Substrate Activity Scope of Galactose Oxidase: Application in the Conversion of Pharmaceutically Relevant Bulky Secondary Alcohols
Shreyas Supekar, Dillon W. P. Tay, Wan Lin Yeo, Kwok Wai Eric Tam, Ying Sin Koo, Jie Yang See, Jhoann M. T. Miyajima, Sebastian Maurer-Stroh, Ee Lui Ang, Yee Hwee Lim, Hao Fan
DOI: https://doi.org/10.26434/chemrxiv-2024-j2tk2
2024-08-13
Abstract: Biocatalysis is increasingly being adopted in industry for producing important chemicals in a selective, efficient, and sustainable way. Engineering an enzyme can often give it an altered chemical scope, opening access to new and desirable chemistry. Identifying enzymes with the desired substrate specificity and activity, however, remains time-consuming and costly. Galactose oxidase (GOase) is a copper-dependent enzyme that converts alcohols to their corresponding carbonyls, an important transformation in industrial synthesis. Here, we present a machine learning-aided protocol to develop a catalytic activity prediction model (R² ≈ 0.7-0.9) for GOase based on a focused dataset of engineered GOase variants with activity toward bulky benzylic secondary alcohols. The trained GOase activity prediction models, with no additional training, also retained their predictive power when applied to another member of the oxidase family, an aryl-alcohol oxidase. Inspired by the fragment-based optimization methods used in drug discovery, we developed an active-site structure-aware substrate library for GOase. Experimental validation of a subset of the constructed substrate library indicates that the trained models predict GOase activity well (R² = 0.61), enabling the identification of the best GOase variant for each new substrate. This ability to identify optimal GOase variants for the synthesis of industrially important chemicals was demonstrated for Dyclonine, an FDA-approved drug. Our machine learning-guided approach enables rapid navigation of the substrate-activity scope of GOase, thereby reducing the burden of extensive experimental screening and streamlining the deployment of biocatalysis in industrial synthesis.
Chemistry
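The abstract reports model quality as R² and describes using predicted activities to pick the best GOase variant for each new substrate. The sketch below illustrates those two steps only, under stated assumptions: the function names `r_squared` and `best_variant_per_substrate`, the dictionary layout, and the substrate and variant labels are hypothetical and are not taken from the paper, and the numbers are toy values rather than reported data. It does not reproduce the authors' models or features.

```python
def r_squared(observed, predicted):
    """Coefficient of determination between observed and predicted activities."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: error left after applying the model.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: spread of the observations around their mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


def best_variant_per_substrate(predicted_activity):
    """Map each substrate to the variant with the highest predicted activity.

    `predicted_activity` maps (substrate, variant) pairs to a predicted
    activity value (hypothetical layout, chosen for illustration).
    """
    best = {}
    for (substrate, variant), activity in predicted_activity.items():
        if substrate not in best or activity > best[substrate][1]:
            best[substrate] = (variant, activity)
    return {substrate: variant for substrate, (variant, _) in best.items()}


# Toy values only; labels are placeholders, not compounds or variants from the paper.
toy_predictions = {
    ("substrate_1", "variant_A"): 0.2,
    ("substrate_1", "variant_B"): 0.9,
    ("substrate_2", "variant_A"): 0.5,
}
chosen = best_variant_per_substrate(toy_predictions)
```

In practice the predicted activities would come from a trained regression model, and `r_squared` would be evaluated on held-out experimental measurements rather than on the training data.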