Predicting odor from vibrational spectra: a data-driven approach
Durgesh Ameta, Laxmidhar Behera, Aniruddha Chakraborty, Tushar Sandhan
DOI: https://doi.org/10.1038/s41598-024-70696-w
2024-09-02
Abstract: This study investigates olfaction, a complex and poorly understood sensory modality. Two theories have so far been proposed to describe the chemical mechanism behind smell: the vibrational theory and the docking theory. The vibrational theory has been gaining acceptance lately but needs more extensive validation. To fill this gap, we present, for the first time, a systematic analysis of a large dataset of vibrational spectra (VS) of 3018 molecules obtained from atomistic simulation, using data-driven classification, clustering, and Explainable AI techniques. The study utilizes image representations of VS based on Gramian Angular Fields and Markov Transition Fields, allowing computer vision techniques to be applied for better feature extraction and improved odor classification. Furthermore, we fuse PCA-reduced fingerprint features with image features, which further improves classification results. We apply two clustering methods, agglomerative hierarchical clustering (AHC) and k-means, to dimensionality-reduced (UMAP, MDS, t-SNE, and PCA) VS and image features, which sheds further light on the connections between molecular structure, VS, and odor. Additionally, we contrast our method with an earlier work that employed traditional machine learning on fingerprint features for the same dataset, and demonstrate that even with a representative subset of 3018 molecules, our deep learning model outperforms previous results. This comprehensive and systematic analysis highlights the potential of deep learning in furthering the field of olfactory research while confirming the vibrational theory of olfaction.
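To make the image representations mentioned in the abstract concrete, the following is a minimal sketch of how a one-dimensional spectrum can be turned into Gramian Angular Field and Markov Transition Field images. It uses the standard textbook definitions with NumPy only and is not taken from the paper's pipeline. The function names, the synthetic two-peak spectrum, the 64-point resolution, and the choice of 8 quantile bins are all assumptions made for illustration.

```python
import numpy as np


def rescale(x):
    """Min-max rescale a 1D signal to [-1, 1] (constant signals map to 0)."""
    x = np.asarray(x, dtype=float)
    lo, hi = x.min(), x.max()
    if hi == lo:
        return np.zeros_like(x)
    return 2.0 * (x - lo) / (hi - lo) - 1.0


def gramian_angular_field(x, kind="summation"):
    """Gramian Angular Field of a 1D signal.

    Each rescaled value s_i is encoded as an angle phi_i = arccos(s_i).
    summation:  G[i, j] = cos(phi_i + phi_j)
    difference: G[i, j] = sin(phi_i - phi_j)
    """
    s = np.clip(rescale(x), -1.0, 1.0)
    phi = np.arccos(s)
    if kind == "summation":
        return np.cos(phi[:, None] + phi[None, :])
    if kind == "difference":
        return np.sin(phi[:, None] - phi[None, :])
    raise ValueError("kind must be 'summation' or 'difference'")


def markov_transition_field(x, n_bins=8):
    """Markov Transition Field of a 1D signal.

    Values are assigned to quantile bins, a first-order transition matrix W
    between bins is estimated along the signal, and the image is
    M[i, j] = W[bin(x_i), bin(x_j)].
    """
    x = np.asarray(x, dtype=float)
    # Interior quantile edges; np.digitize then yields bin ids 0..n_bins-1.
    edges = np.quantile(x, np.linspace(0.0, 1.0, n_bins + 1)[1:-1])
    q = np.digitize(x, edges)
    # Count transitions between consecutive points, then row-normalize.
    W = np.zeros((n_bins, n_bins))
    np.add.at(W, (q[:-1], q[1:]), 1.0)
    row_sums = W.sum(axis=1, keepdims=True)
    W = np.divide(W, row_sums, out=np.zeros_like(W), where=row_sums > 0)
    return W[q[:, None], q[None, :]]


# Synthetic stand-in for a vibrational spectrum (NOT data from the study):
# two Gaussian bands on a hypothetical 0-4000 cm^-1 wavenumber grid.
wavenumbers = np.linspace(0.0, 4000.0, 64)
spectrum = (
    1.0 * np.exp(-0.5 * ((wavenumbers - 1700.0) / 150.0) ** 2)
    + 0.6 * np.exp(-0.5 * ((wavenumbers - 2900.0) / 150.0) ** 2)
)

gasf = gramian_angular_field(spectrum, kind="summation")
gadf = gramian_angular_field(spectrum, kind="difference")
mtf = markov_transition_field(spectrum, n_bins=8)

print(gasf.shape, gadf.shape, mtf.shape)
```

Each resulting array is a square image with one row and column per spectral point, so it can be stacked into channels and passed to an image model. How the images are sized, combined, and fed to the network in the actual study is not stated in the abstract.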