Personalized Machine Learning Models of Terminal Olefin Hydroformylation for Regioselectivity Prediction

Hao Wang, Yuzhuo Chen, Hang Yu, Menghui Qi, De Xia, Minkai Qin, XuCheng Lv, Bing Lu, Ruiliang Gao, Yong Wang, Shanjun Mao
DOI: https://doi.org/10.1016/j.checat.2024.101079
2024-01-01
Chem Catalysis
Abstract: The integration of machine learning into hydroformylation processes represents a pivotal advancement in high-throughput screening within the chemical industry. This study employs a data-driven approach to develop predictive models for terminal olefin reactions. Using a database of 1,167 entries, we merged reaction embeddings with corresponding labels. The well-trained extreme gradient boosting model achieves a test set coefficient of determination (R²) score of 0.897. However, when applied to specific-olefin tasks, the model shows suboptimal performance. Therefore, tailored models for specific olefins like 1-octene and styrene are developed, achieving improved test set R² scores of 0.850 and 0.789, respectively, compared to the general-olefin task. Interpretability findings highlight the significance of high-temperature, low-pressure, and low-concentration metals in enhancing linear regioselectivity and providing chemical insights. This study underscores the transformative potential of machine learning as a surrogate model in advancing high-throughput screening and optimizing chemical processes in the industry.
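For readers unfamiliar with the metric quoted in the abstract, the coefficient of determination can be sketched as below. This is a minimal illustration of the standard definition, R² = 1 - SS_res / SS_tot, and not the authors' evaluation code; the function name `r2_score` and the sample selectivity values are assumptions made up for the example and do not come from the paper's dataset.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error of the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean label.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all true values are identical")
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical linear-selectivity labels and predictions (illustrative only).
    measured = [0.62, 0.75, 0.81, 0.90]
    predicted = [0.60, 0.78, 0.80, 0.88]
    print(f"R2 = {r2_score(measured, predicted):.3f}")
```

A score of 1 means the predictions match the labels exactly, while a score of 0 means the model does no better than predicting the mean label for every reaction.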