A Graph-based Approach for Integrating Biological Heterogeneous Data Based on Connecting Ontology.
Shilong Zhang, Yue Tang, Jing Yan, Linye Li, Tong Li, Jixiang Li, Peilin Xie, Yuanshuai Gu, Jiakang Xu, Zaiwen Feng, Wen Zhang, Jingbo Xia, Wolfgang Mayer, Hong-Yu Zhang, Guang-Cun He, Keqing He
DOI: https://doi.org/10.1109/bibm52615.2021.9669700
2021-01-01
Abstract: Linked Open Data (LOD) is an ongoing effort in the Semantic Web community to build a massive public knowledge graph. The goal is to extend the Web by publishing open datasets as RDF and linking their data items to related information from other data sources. With linked data, a person or machine can start from any point in the graph and explore it to find other related data. In this paper, we develop a novel pipeline for graph-based biological data integration. With our pipeline, users can easily glue together heterogeneous biological ontologies, effectively annotate sources with multiple join tables, automatically obtain a high-quality biological knowledge graph, and finally enrich that knowledge graph with public biological ontologies. We implement a platform that realizes the proposed approach and conduct two case studies to evaluate its effectiveness and efficiency.
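
The graph exploration described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes a linked dataset reduced to plain (subject, predicate, object) triples, and every identifier in it (`ex:gene1`, `ex:encodes`, the `explore` function, and so on) is hypothetical. Starting from one node, a breadth-first walk over outgoing links collects the related data items.

```python
from collections import deque

# Hypothetical triples (subject, predicate, object).
# All identifiers are illustrative and do not come from the paper.
TRIPLES = [
    ("ex:gene1", "ex:encodes", "ex:protein1"),
    ("ex:protein1", "ex:participatesIn", "ex:pathway1"),
    ("ex:gene1", "ex:locatedOn", "ex:chromosome1"),
    ("ex:gene2", "ex:encodes", "ex:protein2"),
]


def build_index(triples):
    """Map each subject to its outgoing (predicate, object) links."""
    index = {}
    for subject, predicate, obj in triples:
        index.setdefault(subject, []).append((predicate, obj))
    return index


def explore(triples, start):
    """Breadth-first walk over outgoing links, returning nodes in visit order."""
    index = build_index(triples)
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        node = queue.popleft()
        for _predicate, obj in index.get(node, []):
            if obj not in seen:
                seen.add(obj)
                order.append(obj)
                queue.append(obj)
    return order


reachable = explore(TRIPLES, "ex:gene1")
```

Here `reachable` holds the protein, chromosome, and pathway linked to `ex:gene1`, while `ex:gene2` and `ex:protein2` stay out of reach because no link leads to them. A real deployment would more likely query an RDF store than walk an in-memory list.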