Machine learning-based yield prediction for transition metal-catalyzed cross-coupling reactions
C. Rajalakshmi, Vivek Vijay, Abhirami Vijayakumar, Shajila Salim, Sherin Susan Cherian, Parvathi Santhoshkumar, John B. Kottooran, Ann Miriam Abraham, G. Krishnaveni, C. S. Anjanakutty, Binuja Varghese, Vibin Ipe Thomas
DOI: https://doi.org/10.1007/s00214-024-03159-0
2024-12-04
Theoretical Chemistry Accounts
Abstract: The advent of transition metal-catalyzed cross-coupling reactions marked a significant milestone in organic chemistry, primarily because of their pivotal role in constructing carbon–carbon and carbon–heteroatom bonds. Traditionally, yield determination in cross-coupling reactions has relied predominantly on experimental methods. However, recent advances in machine learning (ML) algorithms have made it possible to estimate yields with predictive models. While prior studies have concentrated primarily on homogeneous datasets of cross-coupling reactions, accurately predicting yields for heterogeneous datasets poses a formidable challenge. To address this issue, this study aims to develop machine learning models for yield prediction by curating extendable, open-access heterogeneous datasets of transition metal-catalyzed cross-coupling reactions. We employed both regression and classification models with various featurization methods. Among them, the DRFP-featurized random forest model performed better than the neural network and KNN models, attaining an R² value of 0.79. By identifying suitable machine learning models for yield prediction, this study contributes to the development of predictive models for sustainable transition metal catalysis.
chemistry, physical
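The abstract reports model quality as an R² value. As a reading aid, here is a minimal sketch of how the coefficient of determination is conventionally computed, using the standard definition R² = 1 − SS_res / SS_tot. It is not the authors' evaluation code; the function name `r_squared` and the yield values are hypothetical and chosen only for illustration.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error left over after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: spread of the observed values around their mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical reaction yields in percent (illustrative, not from the paper).
observed = [10.0, 50.0, 90.0]
predicted = [20.0, 50.0, 80.0]
score = r_squared(observed, predicted)
print(f"R^2 = {score:.4f}")  # prints "R^2 = 0.9375"
```

A value of 1 means the predictions match the observed yields exactly, while 0 means the model does no better than always predicting the mean observed yield.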