DeepGOA: Predicting Gene Ontology Annotations of Proteins Via Graph Convolutional Network

Guangjie Zhou, Jun Wang, Xiangliang Zhang, Guoxian Yu
DOI: https://doi.org/10.1109/bibm47256.2019.8983075
2019-01-01
Abstract: Gene Ontology (GO) uses a set of standardized, controlled GO terms to describe the molecular functions, biological process roles and cellular locations of gene products (i.e., proteins and RNAs); it organizes these terms in a directed acyclic graph (DAG). GO has more than 40,000 terms, and each protein is annotated with only a few to a few dozen of them. Accurately assigning the relevant GO terms to a protein from such a large candidate set is difficult. Some deep learning models have been proposed that use the GO hierarchy for protein function prediction, but they make inadequate use of it. To exploit the knowledge encoded in the GO hierarchy, we propose a deep Graph Convolutional Network (GCN) based model (DeepGOA) to predict GO annotations of proteins. DeepGOA first uses GO annotations and the hierarchy to measure the correlations between GO terms and updates the edge weights of the DAG accordingly, and then applies a GCN to the updated DAG to learn the semantic representations and latent inter-relations of GO terms. At the same time, it uses a Convolutional Neural Network (CNN) to learn a feature representation of amino acid sequences with respect to the semantic representations. DeepGOA then computes the dot product of the two representations, which allows the whole network to be trained end-to-end in a coherent fashion. Experiments on two model species (Human and Corn) show that DeepGOA outperforms state-of-the-art deep learning based methods. The ablation study shows that the GCN can exploit the knowledge in GO and boost performance.
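The pipeline described in the abstract (a GCN over the weighted GO DAG, a CNN over the amino acid sequence, and a dot product between the two representations) can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation. The single GCN layer, row normalization with self-loops, one-hot term features, global max pooling, sigmoid output, random untrained weights, the toy 4-term graph, and every function name are assumptions made for illustration. The edge weights stand in for the correlation-based weights the paper derives, which are simply given here.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def normalize_adjacency(adj):
    """Add self-loops and row-normalize (assumed scheme): D^-1 (A + I)."""
    n = len(adj)
    looped = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    return [[v / sum(row) for v in row] for row in looped]

def gcn_layer(adj_norm, feats, weight):
    """One graph convolution: ReLU(A_norm @ H @ W)."""
    out = matmul(matmul(adj_norm, feats), weight)
    return [[max(0.0, v) for v in row] for row in out]

def encode_sequence(seq, filters):
    """1D convolution over one-hot residues, then global max pooling.

    Each filter is a list of `width` rows with one value per amino acid,
    so the one-hot product reduces to an index lookup.
    """
    idx = [AMINO_ACIDS.index(aa) for aa in seq]
    pooled = []
    for f in filters:
        width = len(f)
        responses = [
            sum(f[k][idx[p + k]] for k in range(width))
            for p in range(len(idx) - width + 1)
        ]
        pooled.append(max(responses))
    return pooled

def predict(seq, adj, term_feats, gcn_weight, filters):
    """Score every GO term as sigmoid(<protein vector, term vector>)."""
    term_repr = gcn_layer(normalize_adjacency(adj), term_feats, gcn_weight)
    protein = encode_sequence(seq, filters)
    logits = [sum(p * t for p, t in zip(protein, term)) for term in term_repr]
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]

# Toy setup: 4 GO terms, 3-dimensional shared embedding space.
random.seed(0)
adj = [  # hypothetical weighted edges between terms
    [0.0, 0.8, 0.5, 0.0],
    [0.0, 0.0, 0.0, 0.6],
    [0.0, 0.0, 0.0, 0.9],
    [0.0, 0.0, 0.0, 0.0],
]
term_feats = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
gcn_weight = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(4)]
filters = [
    [[random.uniform(-1, 1) for _ in AMINO_ACIDS] for _ in range(3)]
    for _ in range(3)  # 3 filters of width 3, matching the embedding size
]

scores = predict("MKTAYIAKQR", adj, term_feats, gcn_weight, filters)
print([round(s, 3) for s in scores])  # one score per GO term
```

Because both branches end in the same embedding dimension, the dot product yields one score per GO term, and in a trained model gradients from a single loss would flow into both the GCN and the CNN.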