A Survey on Graph Embedding Techniques for Biomedical Data: Methods and Applications
Yaozu Wu, Yankai Chen, Zhishuai Yin, Weiping Ding, Irwin King
DOI: https://doi.org/10.1016/j.inffus.2023.101909
IF: 18.6
2023-01-01
Information Fusion
Abstract: Rapid advances in biomedical technologies have produced large volumes of relational data linking biomedical entities such as genes, proteins, and drugs. Biomedical graphs, one of the most popular representations of relational data, can describe complex biomedical systems at several levels, including molecular-level, multi-omics-level, therapeutics-level, and healthcare-level interactions. However, traditional graph analysis methods still struggle with two properties of biomedical graphs, heterogeneity and dynamics, when handling high-dimensional, multi-modal, and sparsely interconnected biomedical data. To address these issues, graph embedding methods that can effectively analyze biomedical graphs have recently received significant attention. In general, these methods map graph-based data into a low-dimensional vector space while preserving its structural properties and information. The resulting vector representations are then used for computation in downstream biomedical tasks, such as gene function prediction and drug–target interaction prediction. In this article, we focus on applications of graph data in the biomedical domain and introduce recent developments in graph embedding techniques (homogeneous, heterogeneous, and dynamic), covering both methodologies and related biomedical tasks. We also summarize relevant biomedical datasets and open-source implementations, and discuss existing limitations and potential solutions. We hope this survey provides useful directions for researchers interested in applying graph embedding methods to problems in the biomedical field.
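
As a rough illustration of what mapping a graph into a low-dimensional vector space can look like, the sketch below embeds a toy interaction graph with a simple spectral method (leading eigenvectors of a normalized adjacency matrix, found by orthogonal iteration). This is one classical technique chosen for brevity, not necessarily a method the survey singles out. The function name `spectral_embedding`, the node labels `P1` to `P6`, and the edge list are all hypothetical.

```python
import math
import random


def spectral_embedding(nodes, edges, dim=2, iters=500, seed=0):
    """Embed nodes using leading eigenvectors of a normalized adjacency matrix.

    Illustrative sketch only: dense matrices, undirected unweighted graph.
    """
    index = {name: i for i, name in enumerate(nodes)}
    n = len(nodes)
    # Adjacency matrix with self-loops (A + I).
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for u, v in edges:
        a[index[u]][index[v]] = a[index[v]][index[u]] = 1.0
    deg = [sum(row) for row in a]
    # M = (D^-1/2 (A + I) D^-1/2 + I) / 2 is symmetric with eigenvalues in [0, 1],
    # so the largest-magnitude eigenvalues are also the largest algebraically.
    m = [[(a[i][j] / math.sqrt(deg[i] * deg[j]) + (1.0 if i == j else 0.0)) / 2
          for j in range(n)] for i in range(n)]

    rng = random.Random(seed)
    # Orthogonal iteration: multiply by M, then re-orthonormalize (Gram-Schmidt).
    basis = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(dim + 1)]
    for _ in range(iters):
        new_basis = []
        for vec in basis:
            w = [sum(m[i][j] * vec[j] for j in range(n)) for i in range(n)]
            for q in new_basis:
                proj = sum(wi * qi for wi, qi in zip(w, q))
                w = [wi - proj * qi for wi, qi in zip(w, q)]
            norm = math.sqrt(sum(wi * wi for wi in w))
            new_basis.append([wi / norm for wi in w])
        basis = new_basis
    # Skip the first eigenvector (it only reflects node degree); keep the next `dim`.
    return {name: [basis[k][index[name]] for k in range(1, dim + 1)] for name in nodes}


# Hypothetical toy graph: two tightly connected groups joined by one edge.
nodes = ["P1", "P2", "P3", "P4", "P5", "P6"]
edges = [("P1", "P2"), ("P1", "P3"), ("P2", "P3"),
         ("P3", "P4"),
         ("P4", "P5"), ("P4", "P6"), ("P5", "P6")]
emb = spectral_embedding(nodes, edges)
for name in nodes:
    print(name, [round(x, 3) for x in emb[name]])
```

Nodes in the same group should end up close together in the embedding, while nodes in different groups should be separated along the first coordinate. A downstream task such as link prediction could then operate on these vectors instead of on the raw graph.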