Principle, Method and Application of Relationship Inference Based on Biological Networks
Shao LI, Peng ZHANG, Jin GU, Rui JIANG, Yanda LI
DOI: https://doi.org/10.1360/ssi-2021-0243
2022-01-01
Abstract: In the era of big biomedical data, systematically discovering key elements, such as disease-causing genes and drug targets, and understanding the micro-level nature of macro-level phenotypes in a holistic fashion remain a common challenge for information science, Western medicine, and traditional Chinese medicine (TCM). Overcoming this challenge requires solving the problems of multi-scale information fusion and of the high dimensionality, high noise, and small sample sizes of biomedical data, through an in-depth understanding of the “relationship” nature of complex biological systems (CBSs), since biology is a typical complex system. Biological networks, as the basis of CBSs, reflect the interrelationships among biological molecules such as genes and gene products in the human body, as well as those between biological molecules and diseases and drugs at different levels. Biological networks have been widely used in biomedical data analysis. We began studying the relationship between Chinese and Western medicines and complex biological networks (CBNs) more than 20 years ago, took the lead in proposing the “network target” hypothesis, and went on to develop the corresponding methods and applications. In principle, this article uncovers a novel relationship, termed the “multilevel modular relationship”, between macro-level phenotypes and micro-level molecules based on CBNs, and discusses CBN-based “relationship inference”. It reveals that macro-level emergence has local modularity at the micro level, and that the more similar the macro-level phenotypes, the stronger the modular relationships among micro-level molecules (disease-causing genes or drug targets). Methodologically, we further establish a general CBN-based computational framework for relationship inference that infers key elements from big biomedical data with a small number of positive samples, from a global perspective.
The framework consists of three parts: (1) relationship network construction, (2) relationship representation and modeling, and (3) unknown relationship inference, with the aim of the substantialization, mathematization, and integration of relationships. At the application level, the relationship inference framework has shown good performance in predicting disease-causing genes and drug targets, identifying disease-related markers, and uncovering molecular mechanisms related to TCM. Thus, this framework provides a systematic solution for comprehensively understanding the micro-level nature of complex diseases and TCM. It also provides important theoretical and methodological support for some emerging disciplines, including network pharmacology.
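To make the three-part structure concrete, the following is a minimal sketch of network-based inference from a few positive samples. It is not the method of this article: it uses random walk with restart, a commonly used network propagation technique, purely as a stand-in. The gene names (G1 to G6), the edges, the seed set, and the restart value are all hypothetical.

```python
# Toy illustration only: gene names, edges, and seeds are hypothetical.

def build_network(edges):
    """Part 1 (network construction): undirected network as an adjacency dict."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Parts 2-3: represent known relationships as a seed vector, then
    propagate it over the network to score every node."""
    nodes = sorted(adj)
    # Known positives (seeds) share the initial probability mass equally.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {}
        for n in nodes:
            # Each neighbor m passes on its mass split evenly over its edges.
            inflow = sum(p[m] / len(adj[m]) for m in adj[n])
            nxt[n] = (1.0 - restart) * inflow + restart * p0[n]
        diff = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if diff < tol:
            break
    return p


edges = [("G1", "G2"), ("G2", "G3"), ("G1", "G3"),
         ("G3", "G4"), ("G4", "G5"), ("G5", "G6")]
adj = build_network(edges)
seeds = {"G1", "G2"}  # assumed known disease-causing genes
scores = random_walk_with_restart(adj, seeds)

# Rank the unlabeled genes by their proximity to the seeds.
ranking = sorted((n for n in adj if n not in seeds),
                 key=lambda n: scores[n], reverse=True)
print(ranking)
```

In this toy network G3 ranks first among the unlabeled genes because it is directly connected to both seeds, which mirrors the modularity idea in the abstract: molecules related to the same phenotype tend to sit close together in the network.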