Pseudo-Siamese Neural Network Based Graph and Sequence Representation Learning for Molecular Property Prediction.

Chaoran Zhang, Xiangfeng Yan, Yong Liu
DOI: https://doi.org/10.1109/bibm55620.2022.9994859
2022-01-01
Abstract: Molecular property prediction has received great attention due to its wide application in the biomedical field. Effective molecular representation learning is of substantial significance for molecular property prediction. In recent years, with the development of artificial intelligence technology, more and more computer scientists have begun to apply deep learning methods to molecular property prediction instead of traditional machine learning methods. However, these methods either use only SMILES sequences to learn a sequence representation or use only molecular graphs to learn a graph representation, and thus fail to combine the strengths of both approaches in preserving molecular characteristics. In this study, we propose a joint graph and sequence representation learning model for molecular property prediction, called PSGS. Specifically, PSGS uses a fusion layer to combine the graph and sequence representations and capture the critical features of the molecule. In addition, PSGS is trained with a new self-supervised task, which maximizes the similarity between the graph and sequence representations of the same molecule by using a pseudo-Siamese neural network. We conduct extensive experiments to compare our model with state-of-the-art models. Experimental results show that our model significantly outperforms the current state-of-the-art methods on four independent datasets.
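The two ideas named in the abstract, aligning a graph view and a sequence view of the same molecule and then fusing them, can be illustrated with a small sketch. This is not the PSGS implementation: the abstract does not specify the encoders, the fusion operator, or the loss, so the linear encoders, the concatenation-based `fuse`, the cosine-based `alignment_loss`, and the random placeholder features below are all assumptions chosen only for illustration.

```python
import math
import random


def linear_encoder(weights, features):
    """Project a feature vector with a weight matrix (one row per output unit)."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + 1e-12)


def alignment_loss(graph_repr, seq_repr):
    """Assumed objective: minimizing 1 - cos maximizes the similarity of the two views."""
    return 1.0 - cosine_similarity(graph_repr, seq_repr)


def fuse(graph_repr, seq_repr):
    """Assumed fusion: plain concatenation stands in for the paper's fusion layer."""
    return graph_repr + seq_repr


def random_matrix(rng, rows, cols):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)

# Placeholder inputs: a real model would derive these from the molecular
# graph (atoms and bonds) and from the tokenized SMILES string.
graph_features = [rng.uniform(0.0, 1.0) for _ in range(6)]
seq_features = [rng.uniform(0.0, 1.0) for _ in range(8)]

# Pseudo-Siamese: two branches with separate (unshared) weights, so each
# branch can accept a different input type and size.
graph_weights = random_matrix(rng, 4, 6)
seq_weights = random_matrix(rng, 4, 8)

graph_repr = linear_encoder(graph_weights, graph_features)
seq_repr = linear_encoder(seq_weights, seq_features)

loss = alignment_loss(graph_repr, seq_repr)
fused = fuse(graph_repr, seq_repr)
print(f"alignment loss: {loss:.3f}, fused dimension: {len(fused)}")
```

The branches do not share weights, which is what distinguishes a pseudo-Siamese network from a standard Siamese one and lets the two inputs differ in type and dimension. In a trained model the alignment loss would be minimized over many molecules before the fused vector is passed to a property predictor.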