3D2SMILES: Translating Physical Molecular Models into Digital DeepSMILES Notations Using Deep Learning

Wenqi Guo, Yiyang Du, Mohamed Shehata

DOI: https://doi.org/10.26434/chemrxiv-2024-zvcb4

2024-08-05

Abstract: Physical molecular models are widely used in educational settings for teaching organic chemistry and other branches of chemistry, offering an intuitive way of understanding molecular structures. Virtual models, while less intuitive, provide additional functionality such as the ability to retrieve molecular names and other properties. To the best of our knowledge, there is currently no tool that links physical 3D molecular models to their digital counterparts. This paper introduces a computer vision model designed to bridge this gap by converting images of physical molecular models into their digital DeepSMILES representations. This conversion facilitates further information retrieval, enhancing educational utility. We developed both synthetic and real datasets to train our model and evaluated its performance across various dataset combinations, model architectures, and dataset sizes. Additionally, we attempted to improve the model's accuracy with multi-image input and beam search. With both enabled, we achieved 62.0% top-1 accuracy and 80.3% top-3 accuracy on our validation set. We also explored the model's characteristics, including explainability via saliency maps, error analysis, and calibration. Finally, we discuss the model's limitations and directions for future research.

Chemistry

What problem does this paper attempt to address?

The main goal of this paper is to develop a computer vision model that converts images of physical molecular models into a digital representation, specifically DeepSMILES notation, a variant of SMILES (Simplified Molecular Input Line Entry System). This conversion makes it possible to look up further information about a molecule starting from its physical model, which increases the value of such models in education. To achieve this goal, the researchers constructed two datasets: a synthetic dataset of computer-generated 3D molecular models and a dataset of real physical molecular models. They trained the model on these datasets and evaluated it across different dataset combinations, model architectures, and dataset sizes.

The research team also attempted to improve output accuracy through multi-image input and beam search. With both enabled, the model achieved a top-1 accuracy of 62.0% and a top-3 accuracy of 80.3% on the validation set. The study further examined the model's interpretability (through saliency maps), its errors, and its calibration, and discussed the model's limitations and directions for future research. Overall, this work aims to bridge the gap between physical molecular models and digital representations, providing more intuitive and functionally rich tools for chemical education.
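
To make the beam search and top-k accuracy figures above more concrete, the following is a minimal sketch, not the authors' implementation. It runs a generic beam search over a hypothetical next-token scoring function (`toy_model`, an invented stand-in for the trained image-to-DeepSMILES decoder) and computes top-k accuracy from the ranked candidates. The token set, the `<end>` marker, and all probabilities are assumptions chosen only for illustration.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

END = "<end>"  # hypothetical end-of-sequence token

Prefix = Tuple[str, ...]


def beam_search(
    next_log_probs: Callable[[Prefix], Dict[str, float]],
    beam_width: int = 3,
    max_len: int = 20,
) -> List[Tuple[Prefix, float]]:
    """Return up to `beam_width` finished sequences, best score first."""
    beams: List[Tuple[Prefix, float]] = [((), 0.0)]
    finished: List[Tuple[Prefix, float]] = []
    for _ in range(max_len):
        # Extend every live prefix by every possible next token.
        candidates = [
            (prefix + (token,), score + log_p)
            for prefix, score in beams
            for token, log_p in next_log_probs(prefix).items()
        ]
        candidates.sort(key=lambda c: c[1], reverse=True)
        # Keep the best `beam_width` candidates; set finished ones aside.
        beams = []
        for seq, score in candidates[:beam_width]:
            if seq[-1] == END:
                finished.append((seq, score))
            else:
                beams.append((seq, score))
        if not beams:
            break
    finished.sort(key=lambda c: c[1], reverse=True)
    return finished[:beam_width]


def top_k_accuracy(
    ranked_predictions: Sequence[Sequence[str]],
    targets: Sequence[str],
    k: int,
) -> float:
    """Fraction of samples whose target is among the first k predictions."""
    hits = sum(
        1 for preds, target in zip(ranked_predictions, targets)
        if target in preds[:k]
    )
    return hits / len(targets)


def toy_model(prefix: Prefix) -> Dict[str, float]:
    """Hypothetical decoder: made-up log-probabilities over a tiny vocabulary."""
    if len(prefix) == 0:
        return {"C": math.log(0.6), "O": math.log(0.4)}
    if len(prefix) == 1:
        return {"C": math.log(0.5), "O": math.log(0.3), END: math.log(0.2)}
    return {END: 0.0}  # force termination after two tokens


results = beam_search(toy_model, beam_width=3)
ranked = ["".join(seq[:-1]) for seq, _ in results]  # drop the END token
print(ranked)
```

In a real evaluation, each validation image would yield its own ranked candidate list, and `top_k_accuracy` with `k=1` and `k=3` would be computed over all of them. A wider beam leaves more candidates available for the top-3 metric at the cost of more decoder calls.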
