Carbon-based molecular properties efficiently predicted by deep learning-based quantum chemical simulation with large language models

Haoyu Wang,Bin Chen,Hangling Sun,Yuxuan Zhang
DOI: https://doi.org/10.1016/j.compbiomed.2024.108531
IF: 7.7
2024-05-10
Computers in Biology and Medicine
Abstract:The prediction of thermodynamic properties of carbon-based molecules based on their geometrical conformation using fluctuation and density functional theories has achieved great success in the field of energy chemistry, while the excessive computational cost provides both opportunities and challenges for the integration of machine learning. In this work, a deep learning-based quantum chemical prediction model was constructed for efficient prediction of thermodynamic properties of carbon-based molecules. We constructed a novel framework — encoding the 3D information into a large language model (LLM), which in turn generates a 2D SMILES string, while embedding a learnable encoding designed to preserve the integrity of the original 3D information, providing better structural information for the model. Additionally, we have designed an equivariant learning module to encompass representations of conformations and feature learning for conformational sampling. This framework aims to predict thermodynamic properties more accurately than learning from 2D topology alone, while providing faster computational speeds than conventional simulations. By combining machine learning and quantum chemistry, we pioneer efficient practical applications in the field of energy chemistry. Our model advances the integration of data-driven and physics-based modeling to unlock novel insights into carbon-based molecules.
engineering, biomedical,computer science, interdisciplinary applications,mathematical & computational biology,biology
What problem does this paper attempt to address?