A Self-Supervised Framework for Learning Biological Entities Representation by Fusing Class Information

Nan Li, Hongfei Lin, Zhihao Yang, Jian Wang
DOI: https://doi.org/10.1109/JBHI.2023.3273333
IF: 7.7
2023-05-05
IEEE Journal of Biomedical and Health Informatics
Abstract: Ontologies are widely used in the biological domain for data annotation, integration, and analysis. Several representation learning methods have been proposed to learn entity representations that support intelligent applications such as knowledge discovery. However, most of them neglect the class information of entities in the ontology. In this article, we propose a unified framework, named ERCI, which jointly optimizes a knowledge graph embedding model and a self-supervised learning objective. In this way, we can generate embeddings of bio-entities that fuse the class information. Moreover, ERCI is a pluggable framework that can be easily combined with any knowledge graph embedding model. We validate ERCI in two ways. First, we use the protein embeddings learned by ERCI to predict protein-protein interactions on two different datasets. Second, we use the gene and disease embeddings generated by ERCI to predict gene-disease associations. In addition, we create three datasets to simulate the long-tail scenario and evaluate ERCI on them. Experimental results show that ERCI outperforms state-of-the-art methods on all metrics.
Computer Science, Biology, Medicine