subGE: Enhancing the subgraph representation of molecular compounds structure–activity relationship discovery

Xiaoyu Chen, Quan Qian
DOI: https://doi.org/10.1016/j.engappai.2022.105727
IF: 8
2022-12-25
Engineering Applications of Artificial Intelligence
Abstract: Prediction of the structure–activity relationship of molecular compounds is one of the most critical tasks in computer-assisted drug design. To accurately predict the properties of molecular compounds and explain their structure–activity relationships, we proposed a subgraph embedding model, subGE, based on reinforcement learning and mutual information mechanisms. First, molecular compounds were abstracted into graphs, and subgraphs were sampled from the original graphs using breadth-first search. These subgraphs were then encoded using graph neural networks and converted into graph embeddings. Reinforcement learning was introduced to reduce the dimensionality of the subgraph embeddings and filter out significant subgraphs. A mutual information mechanism was introduced to further enhance the ability of the filtered subgraphs to characterize the full graph. subGE was evaluated on three open-source datasets, BBBP, BACE, and ClinTox, from the MoleculeNet benchmark available through the DeepChem package. The experimental results showed that subGE achieved accuracies of 86.61%, 80.49%, 95.81%, and 96.34% on four classification tasks across the three datasets. These values represent improvements of 16.87%, 19.29%, 3.91%, and 3.73%, respectively, over existing graph convolutional networks, and of 8.34%, 6.64%, 5.53%, and 5.50%, respectively, over direct encoding of subgraphs without the reinforcement learning and mutual information mechanisms. Visualization showed that the subgraphs extracted by subGE could fully explain the conformational relationships of compounds.
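The first step of the pipeline, sampling subgraphs from a molecular graph by breadth-first search, can be illustrated with a minimal sketch. The abstract does not specify how start atoms are chosen or how large each subgraph is, so the function names (`bfs_subgraph`, `induced_edges`), the `max_nodes` cap, and the toy adjacency list below are all assumptions made for illustration, not the authors' implementation.

```python
from collections import deque


def bfs_subgraph(adjacency, start, max_nodes):
    """Collect up to `max_nodes` atoms in BFS order from `start`.

    `adjacency` maps each atom index to the indices of its bonded neighbors.
    The size cap is a hypothetical parameter; the paper's setting is not
    given in the abstract.
    """
    if max_nodes < 1:
        return []
    visited = [start]          # atoms in the order BFS discovers them
    seen = {start}
    queue = deque([start])
    while queue and len(visited) < max_nodes:
        node = queue.popleft()
        # Sorting makes the traversal deterministic for this toy example.
        for neighbor in sorted(adjacency[node]):
            if neighbor not in seen:
                seen.add(neighbor)
                visited.append(neighbor)
                queue.append(neighbor)
                if len(visited) == max_nodes:
                    break
    return visited


def induced_edges(adjacency, nodes):
    """Return the bonds of the original graph whose endpoints are both in `nodes`."""
    node_set = set(nodes)
    return sorted(
        (u, v)
        for u in node_set
        for v in adjacency[u]
        if v in node_set and u < v
    )


# Toy molecular skeleton (hypothetical): bonds 0-1, 1-2, 1-3, 3-4, 4-5.
toy_adjacency = {
    0: [1],
    1: [0, 2, 3],
    2: [1],
    3: [1, 4],
    4: [3, 5],
    5: [4],
}

sampled_nodes = bfs_subgraph(toy_adjacency, start=1, max_nodes=4)
sampled_edges = induced_edges(toy_adjacency, sampled_nodes)
print(sampled_nodes)  # [1, 0, 2, 3]
print(sampled_edges)  # [(0, 1), (1, 2), (1, 3)]
```

In a full pipeline along the lines the abstract describes, each sampled subgraph would then be passed to a graph neural network encoder; that stage, along with the reinforcement learning filter and the mutual information objective, is outside the scope of this sketch.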
automation & control systems; computer science, artificial intelligence; engineering, electrical & electronic; engineering, multidisciplinary
What problem does this paper attempt to address?