CTsynther: Contrastive Transformer model for end-to-end retrosynthesis prediction

Hao Lu, Zhiqiang Wei, Kun Zhang, Xuze Wang, Liaqat Ali, Hao Liu
DOI: https://doi.org/10.1109/TCBB.2024.3455381
2024-09-06
Abstract: Retrosynthesis prediction is a fundamental problem in organic chemistry and drug synthesis. We propose CTsynther (Contrastive Transformer for single-step retrosynthesis prediction), an end-to-end deep learning model that performs single-step retrosynthesis prediction without external reaction templates or specialized knowledge. The model introduces contrastive learning into the Transformer architecture, employing a contrastive language representation model at the SMILES sentence level to enhance inference by learning the similarities and differences between samples. Mixed global and local attention mechanisms allow the model to capture features and dependencies between different atoms, improving generalization. We further investigate the SMILES embedding representations that the model learns automatically. Visualization results show that the model effectively acquires information about identical molecules, which improves prediction performance. In experiments, retrosynthesis accuracy reached 53.5% and 64.4% with and without reaction types, respectively. The validity of the predicted reactants also improved, making the model competitive with semi-template methods.
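The abstract does not specify the contrastive objective, so the following is only a minimal sketch of one common choice, an InfoNCE-style loss over sentence-level embeddings. It assumes that each anchor embedding is paired with a positive embedding of the same molecule (for example, a different SMILES string for it) and that the other positives in the batch act as negatives. The function names, the temperature value, and the toy vectors are illustrative assumptions, not details taken from the paper.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce_loss(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss over a batch of embedding pairs.

    anchors[i] and positives[i] are assumed to be embeddings of two views
    of the same molecule; positives[j] with j != i serve as negatives.
    The temperature is a hypothetical default, not a value from the paper.
    """
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [cosine_similarity(anchor, p) / temperature for p in positives]
        # Numerically stable log-sum-exp over all candidates in the batch.
        max_logit = max(logits)
        log_denominator = max_logit + math.log(
            sum(math.exp(value - max_logit) for value in logits)
        )
        # Negative log-probability of picking the true positive.
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


if __name__ == "__main__":
    # Toy 2-D embeddings, purely illustrative.
    anchors = [[1.0, 0.0], [0.0, 1.0]]
    aligned = [[0.9, 0.1], [0.1, 0.9]]
    swapped = [[0.1, 0.9], [0.9, 0.1]]
    print("aligned pairs:", info_nce_loss(anchors, aligned))
    print("swapped pairs:", info_nce_loss(anchors, swapped))
```

Under these assumptions the loss is small when each anchor is closest to its own positive and large when the pairs are mismatched, which is the sense in which such an objective learns similarities and differences between samples.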