DNA-GCN: Graph Convolutional Networks for Predicting DNA-protein Binding.

Yuhang Guo,Xiao Luo,Liang Chen,Minghua Deng
DOI: https://doi.org/10.1007/978-3-030-84532-2_41
2021-01-01
Abstract:Predicting DNA-protein binding is an important and classic problem in bioinformatics. Convolutional neural networks have outperformed conventional methods in modeling the sequence specificity of DNA-protein binding. However, none of the studies has utilized graph convolutional networks for motif inference. In this work, we propose to use graph convolutional networks for motif inference. We build a sequence k-mer graph for the whole dataset based on k-mer co-occurrence and k-mer sequence relationship and then learn DNA Graph Convolutional Network(DNA-GCN) for the whole dataset. Our DNA-GCN is initialized with a one-hot representation for all nodes, and it then jointly learns the embeddings for both k-mers and sequences, as supervised by the known labels of sequences. We evaluate our model on 50 datasets from ENCODE. DNA-GCN shows its competitive performance compared with the baseline model. Besides, we analyze our model and design several different architectures to help fit different datasets.
What problem does this paper attempt to address?