Predicting Potential Gene Ontology from Cellular Response Data

Hao Hong, Xiaoyao Yin, Fei Li, Naiyang Guan, Xiaochen Bo, Zhigang Luo
DOI: https://doi.org/10.1145/3035012.3035015
2017-01-01
Abstract: Ontologies have proven useful for capturing and organizing knowledge as a hierarchical set of terms and their relationships. However, curating gene ontology data by hand requires specialized knowledge of a particular field and is inefficient. Inferring gene ontology from the exponentially growing body of biological data has therefore become an active research topic. Based on the Library of Integrated Network-Based Cellular Signatures (LINCS) data, we hypothesized that genes participating in analogous biological processes affect cells in similar ways. By assessing the cellular response after genes were knocked out, we built a similarity matrix with Gene Set Enrichment Analysis (GSEA) and clustered the genes with the affinity propagation algorithm. We then mapped the clustering result to gene ontology biological process data for annotation and enrichment analysis, which confirmed our hypothesis and, for the first time, made it possible to predict biological processes for unannotated genes from cellular response data after gene knockout. We further validated the rationale of the approach using gene ontology molecular function data.
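The similarity step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each knockout signature is a list of gene names ranked from most up-regulated to most down-regulated, uses an unweighted Kolmogorov-Smirnov-style running sum as a stand-in for the GSEA enrichment score, and combines the up and down scores in the style of connectivity scoring. The function names and the cutoff `k` are hypothetical.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted GSEA-style running-sum statistic (assumed variant).

    Walks down the ranked list, stepping up on genes in gene_set and
    down otherwise, and returns the signed maximum deviation from zero.
    """
    hits = [g in gene_set for g in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        return 0.0
    running, best = 0.0, 0.0
    for is_hit in hits:
        running += 1.0 / n_hit if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


def directed_score(query_ranked, target_ranked, k):
    """Score how well the query's top-k and bottom-k genes agree with the target."""
    up = set(query_ranked[:k])      # most up-regulated genes of the query
    down = set(query_ranked[-k:])   # most down-regulated genes of the query
    es_up = enrichment_score(target_ranked, up)
    es_down = enrichment_score(target_ranked, down)
    if es_up * es_down > 0:         # same sign: no coherent signal
        return 0.0
    return (es_up - es_down) / 2.0


def similarity(rank_a, rank_b, k):
    """Symmetric similarity: average of the two directed scores."""
    return (directed_score(rank_a, rank_b, k)
            + directed_score(rank_b, rank_a, k)) / 2.0


def similarity_matrix(signatures, k):
    """Pairwise similarity matrix as a list of lists (hypothetical helper)."""
    return [[similarity(a, b, k) for b in signatures] for a in signatures]


# Toy signatures (made-up gene names, for illustration only).
forward = ["g%d" % i for i in range(10)]
backward = list(reversed(forward))
matrix = similarity_matrix([forward, backward], k=2)
```

A matrix of this kind could then be passed to an affinity propagation implementation that accepts precomputed similarities, for example scikit-learn's `AffinityPropagation(affinity="precomputed")`; that choice of library is an assumption, since the abstract does not name one.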