Ontology-based Representations of Biological Entities

Fatima Zohra Smaili, Xin Gao, Robert Hoehndorf
2020-01-01
Abstract: Biomedical ontologies are widely used to formally structure and represent knowledge in the biomedical field. Ontologies describe biological concepts and their relations through logical axioms and annotation properties (meta-data). The structure and information contained in biomedical ontologies and their annotations make them valuable for data analysis and knowledge extraction tasks. Despite being a rich source of biomedical information, ontologies are poorly exploited by ontology-based analysis methods such as semantic similarity measures, which use only limited information from the ontologies. We propose two methods, Onto2Vec and OPA2Vec, that generate vector representations of biological entities by encoding most of the information in ontologies and their annotations.

1. Onto2Vec: We propose a method that learns dense vector representations of biological entities based on logical axioms and ontology-based annotations of biological entities.

Fig 1. Onto2Vec workflow

Onto2Vec learns the vector representations in three steps:
• Inferring new axioms using a semantic reasoner.
• Representing entity-concept associations as axioms and merging them with the ontology axioms in the corpus.
• Training Word2Vec on the ontology corpus.

2. OPA2Vec: In addition to formal axioms, ontologies encode rich meta-data in natural language describing different aspects of the biological concepts (e.g. labels, descriptions, …). This meta-data is completely unexploited by data analysis methods that use ontologies. OPA2Vec generates vector representations of biological entities by:
• Combining formal …
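The corpus-building part of the Onto2Vec workflow described above can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not the authors' implementation: the class names (`GO_A`, `GO_B`, `GO_C`), the entity names (`P1`, `P2`), the `hasFunction` relation label, and the helper functions are hypothetical, and the reasoner step is simulated by writing the inferred axiom by hand. The final Word2Vec training step is not run; the sketch only enumerates the (target, context) pairs that a skip-gram style model, one of the Word2Vec architectures, would consume.

```python
# Hypothetical ontology axioms, assumed to be already expanded by a
# semantic reasoner (step 1). The last axiom stands in for an inferred one.
ontology_axioms = [
    "GO_B SubClassOf GO_A",
    "GO_C SubClassOf GO_B",
    "GO_C SubClassOf GO_A",  # assumed reasoner output for this toy example
]

# Hypothetical entity-concept associations: (entity, ontology class).
annotations = [("P1", "GO_C"), ("P2", "GO_B")]


def annotation_axioms(pairs, relation="hasFunction"):
    """Step 2a: write each entity-concept association as an axiom-like sentence."""
    return [f"{entity} {relation} {concept}" for entity, concept in pairs]


def build_corpus(axioms, pairs):
    """Step 2b: merge ontology axioms and annotation axioms into tokenized sentences."""
    return [sentence.split() for sentence in axioms + annotation_axioms(pairs)]


def skipgram_pairs(corpus, window=2):
    """Enumerate (target, context) token pairs within a fixed window per sentence."""
    pairs = []
    for tokens in corpus:
        for i, target in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            pairs.extend((target, tokens[j]) for j in range(lo, hi) if j != i)
    return pairs


corpus = build_corpus(ontology_axioms, annotations)
pairs = skipgram_pairs(corpus)
```

Because entities and ontology classes appear as tokens in the same sentences, a model trained on such a corpus places them in one shared vector space. In practice the tokenized `corpus` would be handed to a Word2Vec implementation instead of enumerating pairs by hand.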