A Machine Learning Protocol for Geometric Information Retrieval from Molecular Spectra

Shijie Tao, Yi Feng, Wenmin Wang, Tiantian Han, Pieter E.S. Smith, Jun Jiang
DOI: https://doi.org/10.1016/j.aichem.2023.100031
2024-01-01
Abstract: The geometry of a molecule is closely related to its properties, and vibrational spectroscopy, a common and powerful analytical tool for determining molecular structure, can help obtain precise geometric information. Traditional methods for delineating spectrum-structure correlations are often expensive and time-consuming, and they require extensive professional expertise. In this work, we used a machine learning protocol to construct a map from spectra to molecular geometric structures, and employed Grad-CAM, a technique for interpreting convolutional networks, to analyze which kinds of chemical information are important for determining our model’s results. The results obtained for six small molecules of differing structures demonstrate that the model is capable of (1) extracting the crucial spectral features needed for downstream tasks without any manual preprocessing, and (2) retrieving molecular structural information with high precision.
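As an illustration of the Grad-CAM step mentioned in the abstract, the following is a minimal sketch of the standard Grad-CAM computation adapted to a one-dimensional spectrum. It is not the authors' implementation. It assumes the feature maps of a convolutional layer and the gradients of the predicted quantity with respect to those feature maps have already been computed; the function name `grad_cam_1d` and the array shapes are hypothetical choices made for this example.

```python
import numpy as np

def grad_cam_1d(feature_maps, gradients):
    """Grad-CAM relevance along a 1D spectral axis.

    feature_maps: array of shape (channels, length), activations of one
        convolutional layer (assumed to be precomputed).
    gradients: array of the same shape, d(output)/d(feature_maps)
        (assumed to be precomputed by the training framework).
    """
    feature_maps = np.asarray(feature_maps, dtype=float)
    gradients = np.asarray(gradients, dtype=float)

    # Channel weights: gradients averaged over the spectral axis.
    weights = gradients.mean(axis=1)

    # Weighted sum of feature maps, then ReLU to keep positive evidence only.
    cam = np.maximum(weights @ feature_maps, 0.0)

    # Scale to [0, 1] so the map can be overlaid on the spectrum.
    peak = cam.max()
    return cam / peak if peak > 0 else cam

# Toy input: two channels, four spectral points.
toy_maps = np.array([[0.0, 1.0, 0.0, 0.0],
                     [0.0, 0.0, 2.0, 0.0]])
toy_grads = np.array([[1.0, 1.0, 1.0, 1.0],
                      [-1.0, -1.0, -1.0, -1.0]])
toy_cam = grad_cam_1d(toy_maps, toy_grads)
```

In this toy input the first channel has a positive average gradient and the second a negative one, so only the spectral point activated by the first channel remains highlighted after the ReLU.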