A Unified View of Deep Learning for Reaction and Retrosynthesis Prediction: Current Status and Future Challenges

Ziqiao Meng, Peilin Zhao, Yang Yu, Irwin King
2023-06-28
Abstract: Reaction and retrosynthesis prediction are fundamental tasks in computational chemistry that have recently garnered attention from both the machine learning and drug discovery communities. Various deep learning approaches have been proposed to tackle these problems, and some have achieved initial success. In this survey, we conduct a comprehensive investigation of advanced deep learning-based models for reaction and retrosynthesis prediction. We summarize the design mechanisms, strengths, and weaknesses of state-of-the-art approaches. Then, we discuss the limitations of current solutions and open challenges in the problem itself. Finally, we present promising directions to facilitate future research. To our knowledge, this paper is the first comprehensive and systematic survey that seeks to provide a unified understanding of reaction and retrosynthesis prediction.
Machine Learning, Chemical Physics, Quantitative Methods
What problem does this paper attempt to address?
The paper focuses on two core tasks in chemical synthesis, reaction prediction and retrosynthesis prediction, and reviews both in depth. These tasks are crucial to drug discovery because they help chemists design efficient synthesis routes from simple starting materials to target molecules.

### Problems the Paper Attempts to Solve

1. **Unified Understanding**: It provides, for the first time, a unified perspective on reaction prediction and retrosynthesis prediction and surveys both comprehensively. This includes a systematic discussion of the design mechanisms, advantages, and disadvantages of existing methods.
2. **Evaluation of Advanced Models**: It summarizes how current state-of-the-art deep learning models are applied to reaction prediction and retrosynthesis prediction and evaluates them comprehensively.
3. **Challenges and Limitations**: It explores the limitations and challenges of current solutions, including insufficient datasets, limited evaluation metrics, and uncertainty estimation in non-autoregressive models.
4. **Future Directions**: Based on the current state of research, it proposes several future research directions to further improve model performance.

### Core Contributions

- **Unified Perspective**: For the first time, it analyzes reaction prediction and retrosynthesis prediction as a whole, emphasizing the interrelation and synergy between the two.
- **Method Review**: It gives a detailed introduction to the different families of methods, including template-based methods, sequence-based and graph-based autoregressive models, graph-based two-stage models, and graph-based non-autoregressive models, and analyzes their respective advantages and disadvantages.
- **Challenges and Prospects**: It points out the shortcomings of existing methods in handling side products, dataset limitations, and evaluation standards, and outlines future research directions.

Through these efforts, the paper aims to give researchers in chemical synthesis a comprehensive and systematic guide and to promote the development of more efficient and accurate models in this field.
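
To make the "unified" framing concrete, here is a minimal sketch of how a single recorded reaction can yield a training pair for either task. It is an illustration, not code from the paper: the helper names (`split_reaction`, `forward_example`, `retro_example`) are hypothetical, and the reaction string is assumed to be in the standard `reactants>agents>products` reaction SMILES layout. The example reaction (acetic acid plus ethanol giving ethyl acetate) deliberately omits the water by-product, mirroring the side-product issue mentioned above.

```python
def split_reaction(rxn_smiles):
    """Split 'reactants>agents>products' into sorted reactant and product lists."""
    reactants, _agents, products = rxn_smiles.split(">")
    # Molecules within one side are separated by '.'; sort for a stable order.
    return sorted(reactants.split(".")), sorted(products.split("."))


def forward_example(rxn_smiles):
    """Reaction prediction: reactants are the input, products are the target."""
    reactants, products = split_reaction(rxn_smiles)
    return {"input": ".".join(reactants), "target": ".".join(products)}


def retro_example(rxn_smiles):
    """Retrosynthesis prediction: the same record with input and target swapped."""
    reactants, products = split_reaction(rxn_smiles)
    return {"input": ".".join(products), "target": ".".join(reactants)}


# Esterification; the water side product is not recorded in this string.
rxn = "CC(=O)O.CCO>>CC(=O)OCC"
fwd = forward_example(rxn)   # input: acid + alcohol, target: ester
retro = retro_example(rxn)   # input: ester, target: acid + alcohol
```

The two examples contain the same molecules with the roles of input and target exchanged, which is the sense in which the two tasks can be treated as inverse views of one dataset.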