Predicting Infrared Spectra with Message Passing Neural Networks

Charles McGill, Michael Forsuelo, Yanfei Guan, William H. Green
DOI: https://doi.org/10.1021/acs.jcim.1c00055
IF: 6.162
2021-05-28
Journal of Chemical Information and Modeling
Abstract: Infrared (IR) spectroscopy remains an important tool for chemical characterization and identification. Chemprop-IR has been developed as a software package for the prediction of IR spectra through the use of machine learning. This work serves the dual purpose of providing a trained general-purpose model for the prediction of IR spectra with ease and providing the Chemprop-IR software framework for the training of new models. In Chemprop-IR, molecules are encoded using a directed message passing neural network, allowing for molecule latent representations to be learned and optimized for the task of spectral predictions. Model training incorporates spectra metrics and normalization techniques that offer better performance with spectral predictions than standard practice in regression models. The model makes use of pretraining using quantum chemistry calculations and ensembling of multiple submodels to improve generalizability and performance. The spectral predictions that result are of high quality, showing capability to capture the extreme diversity of spectral forms over chemical space and represent complex peak structures.

The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.jcim.1c00055. Data access statement as well as additional analysis of the computational dataset; peak broadening procedure; SIS loss function; phase performance; ensemble SIS; loss function and fingerprint baseline comparisons; polynomial expansion of SID; and a listing of SMILES strings associated with molecule diagrams (PDF).
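
The abstract mentions spectral metrics, normalization, and ensembling without giving formulas. As a rough illustration only, the sketch below uses the usual definition of spectral information divergence (SID), a symmetrized Kullback-Leibler divergence between two spectra that have each been scaled to sum to 1. The function names, the clipping floor `eps`, and the bin-wise averaging of submodel outputs are assumptions made for this example; they are not taken from the Chemprop-IR code.

```python
import math

def normalize(spectrum, eps=1e-8):
    """Clip absorbances to a small positive floor, then scale them to sum to 1."""
    clipped = [max(x, eps) for x in spectrum]
    total = sum(clipped)
    return [x / total for x in clipped]

def sid(pred, target, eps=1e-8):
    """Symmetrized KL divergence between two sum-normalized spectra."""
    p = normalize(pred, eps)
    t = normalize(target, eps)
    return sum(pi * math.log(pi / ti) + ti * math.log(ti / pi)
               for pi, ti in zip(p, t))

def ensemble_mean(spectra):
    """Average normalized submodel predictions bin by bin (assumed scheme)."""
    normed = [normalize(s) for s in spectra]
    n = len(normed)
    return [sum(column) / n for column in zip(*normed)]

if __name__ == "__main__":
    # Toy three-bin spectra, purely illustrative.
    submodel_outputs = [[0.1, 0.7, 0.2], [0.2, 0.6, 0.2]]
    prediction = ensemble_mean(submodel_outputs)
    print(sid(prediction, [0.15, 0.65, 0.20]))
```

Because both spectra are normalized first, this measure compares spectral shape rather than absolute intensity: rescaling either input by a constant leaves the value unchanged, and identical spectra give a divergence of zero.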
chemistry, multidisciplinary, medicinal, computer science, interdisciplinary applications, information systems