ChemiRise: a data-driven retrosynthesis engine

Xiangyan Sun, Ke Liu, Yuquan Lin, Lingjie Wu, Haoming Xing, Minghong Gao, Ji Liu, Suocheng Tan, Zekun Ni, Qi Han, Junqiu Wu, Jie Fan
DOI: https://doi.org/10.48550/arXiv.2108.04682
IF: 2.552
2021-08-09
Chemical Physics
Abstract: We have developed an end-to-end retrosynthesis system, named ChemiRise, that can rapidly and reliably propose complete retrosynthesis routes for organic compounds. The system was trained on a processed patent database of over 3 million organic reactions. Experimental reactions were atom-mapped, clustered, and extracted into reaction templates. We then trained a one-step reaction proposer, based on a graph convolutional neural network and template embeddings, and developed a guiding algorithm that operates on the directed acyclic graph (DAG) of chemical compounds to select the best candidate to explore next. The atom-mapping algorithm and the one-step reaction proposer were each benchmarked against previous studies and showed better results. The final product was demonstrated through retrosynthesis routes that were reviewed and rated by human experts, showing satisfactory functionality and a potential productivity boost in real-life use cases.
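The interplay between a one-step proposer and a guided search over a graph of compounds can be illustrated with a minimal sketch. This is not the ChemiRise implementation: the abstract does not specify the guiding algorithm, so a plain best-first search is swapped in here, and `toy_proposer`, `search`, `is_solved`, the molecule names, and all scores are hypothetical stand-ins (a lookup table takes the place of the trained graph convolutional network).

```python
import heapq
from typing import Callable, Dict, List, Set, Tuple

# A proposal is (score in (0, 1], precursor molecules). Hypothetical format.
Proposal = Tuple[float, Tuple[str, ...]]


def toy_proposer(molecule: str) -> List[Proposal]:
    """Stand-in for a learned one-step proposer (assumption: a lookup table)."""
    table: Dict[str, List[Proposal]] = {
        "target": [(0.9, ("A", "B")), (0.4, ("C",))],
        "A": [(0.8, ("stock1", "stock2"))],
        "C": [(0.5, ("stock1",))],
    }
    return table.get(molecule, [])


def search(
    target: str,
    proposer: Callable[[str], List[Proposal]],
    stock: Set[str],
    max_expansions: int = 100,
) -> Dict[str, List[Proposal]]:
    """Best-first expansion: always expand the compound whose path score
    (product of proposal scores from the target) is highest."""
    graph: Dict[str, List[Proposal]] = {}
    frontier: List[Tuple[float, str]] = [(-1.0, target)]  # min-heap on negated score
    expansions = 0
    while frontier and expansions < max_expansions:
        neg_score, mol = heapq.heappop(frontier)
        if mol in stock or mol in graph:
            continue  # purchasable, or already expanded via another path
        graph[mol] = proposer(mol)
        expansions += 1
        for score, precursors in graph[mol]:
            for p in precursors:
                if p not in stock and p not in graph:
                    heapq.heappush(frontier, (neg_score * score, p))
    return graph


def is_solved(
    mol: str,
    graph: Dict[str, List[Proposal]],
    stock: Set[str],
    _seen: frozenset = frozenset(),
) -> bool:
    """A compound is solved if it is in stock, or if some proposed reaction
    has all of its precursors solved. `_seen` guards against cycles."""
    if mol in stock:
        return True
    if mol in _seen:
        return False
    return any(
        all(is_solved(p, graph, stock, _seen | {mol}) for p in precursors)
        for _, precursors in graph.get(mol, [])
    )


stock = {"stock1", "stock2", "B"}
graph = search("target", toy_proposer, stock)
solved = is_solved("target", graph, stock)
```

Because a compound reached through several routes is stored only once in `graph`, the explored structure is a shared graph rather than a tree, which is the motivation for searching over a DAG of compounds. A production system would replace the product-of-scores priority with whatever guiding heuristic it has learned or designed.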