Pre-training molecular representation model with spatial geometry for property prediction

Yishui Li, Wei Wang, Jie Liu, Chengkun Wu
DOI: https://doi.org/10.1016/j.compbiolchem.2024.108023
IF: 3.737
2024-04-01
Computational Biology and Chemistry
Abstract: AI-enhanced bioinformatics and cheminformatics pivot on generating increasingly descriptive and generalizable molecular representations. Accurate prediction of molecular properties requires a comprehensive description of molecular geometry. We design a novel Graph Isomorphism Network (GIN) based model that integrates a three-level network structure with a dual-level pre-training approach aligned with the characteristics of molecules. In our Spatial Molecular Pre-training (SMPT) model, the network learns implicit geometric information layer by layer, from lower to higher dimensions. Extensive evaluations against established baseline models validate the enhanced efficacy of SMPT, with notable gains on classification tasks. These results emphasize the importance of spatial geometric information in molecular representation modeling and demonstrate the potential of SMPT as a valuable tool for property prediction.
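The abstract names GIN as the backbone but does not spell out the update rule. As a rough illustration only, the generic GIN node update, h_v' = MLP((1 + eps) * h_v + sum of neighbor features), can be sketched in plain Python. This is not the SMPT architecture: the toy graph, the feature vectors, the single dense layer standing in for the MLP, and all function names below are assumptions made for the example.

```python
from typing import Dict, List

Vector = List[float]


def gin_aggregate(features: Dict[int, Vector],
                  neighbors: Dict[int, List[int]],
                  eps: float = 0.0) -> Dict[int, Vector]:
    """Compute (1 + eps) * h_v + sum of neighbor features for every node."""
    out = {}
    for v, h_v in features.items():
        agg = [(1.0 + eps) * x for x in h_v]
        for u in neighbors.get(v, []):
            for i, x in enumerate(features[u]):
                agg[i] += x
        out[v] = agg
    return out


def dense_relu(vec: Vector, weight: List[Vector], bias: Vector) -> Vector:
    """One dense layer with ReLU; a stand-in for the MLP in a GIN layer."""
    return [max(0.0, sum(w * x for w, x in zip(row, vec)) + b)
            for row, b in zip(weight, bias)]


def sum_readout(features: Dict[int, Vector]) -> Vector:
    """Sum node vectors into a single graph-level vector."""
    dim = len(next(iter(features.values())))
    return [sum(h[i] for h in features.values()) for i in range(dim)]


# Hypothetical toy molecule: a three-atom chain 0-1-2 with 2-d atom features.
features = {0: [1.0, 0.0], 1: [1.0, 0.0], 2: [0.0, 1.0]}
neighbors = {0: [1], 1: [0, 2], 2: [1]}

# Hypothetical fixed weights; a trained model would learn these.
weight = [[1.0, -1.0], [0.5, 0.5]]
bias = [0.0, 0.0]

aggregated = gin_aggregate(features, neighbors, eps=0.0)
updated = {v: dense_relu(h, weight, bias) for v, h in aggregated.items()}
graph_vector = sum_readout(updated)
print(graph_vector)
```

The sum aggregator (rather than mean or max) is what distinguishes GIN from many other message-passing layers. A geometry-aware model such as the one described would additionally need spatial inputs like bond lengths or angles, which this sketch leaves out.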
biology, computer science, interdisciplinary applications
What problem does this paper attempt to address?