Revealing Potential Drug-Disease-Gene Association Patterns for Precision Medicine
Wang Xuefeng, Zhang Shuo, Wu Yao, Yang Xuemei
DOI: https://doi.org/10.1007/s11192-021-03892-4
IF: 3.801
2021-01-01
Scientometrics
Abstract: Precision medicine means giving patients the right treatment at the right dose at the right time, with minimum ill consequences and maximum efficacy. It is medicine personalized to the individual’s genes, environment, and lifestyle, and its widespread use will ultimately require a deep understanding of the genomic variations that create predispositions or resistances to various diseases. Some of the links between genes and diseases are already known, and more are being discovered every day. Similarly, much is known about which drugs are efficacious for treating which diseases, but there is still more to learn. The issue now is how to extract this information from the biomedical literature in a way that can keep pace with today’s rapid discoveries in medical research. Efforts to date to assemble an organized database of such knowledge have focused on statistical methods, computer-aided methods, and similar approaches. Success has been mixed: previous methods often produce false positives or depend on training sample sets, and so lack generality across research fields, which has held back advances in precision medicine. To break through this bottleneck, we need novel methods that can extract and leverage the valuable information locked within the constraints of the data we have. Hence, in this paper, we present a new text-based computational framework for extracting full three-way drug-disease-gene triplet information related to colorectal cancer from biomedical texts. The framework consists of two main steps. The first is to construct an integrated drug-disease-gene network by extracting pair-wise associations between diseases, drugs, and genes, and then to store the unique drug-disease-gene triplets for further analysis. Since the constructed network is highly likely to be too sparse, the second step is to fill in the missing links in the network, i.e., to predict novel links from genes to diseases to drugs.
To validate our framework, we conducted a case study on colorectal cancer, mining the literature for drug-disease and disease-gene associations. An analysis of the subsequent inferences drawn between the two shows that this approach can help to inform novel research hypotheses and identify new knowledge triplets about various diseases, both of which are significant for the advancement and implementation of precision medicine.
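The two-step framework described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the extraction or link-prediction algorithms, so everything here is an assumption made for illustration: the entity names are toy placeholders rather than real findings, the function names build_triplets and predict_drug_disease are hypothetical, and the prediction step uses a simple shared-gene count in place of whatever method the paper actually applies.

```python
from collections import defaultdict
from itertools import product

# Toy pairwise associations (placeholders, NOT real biomedical findings).
drug_disease = {("drug_A", "disease_1"), ("drug_B", "disease_1"),
                ("drug_B", "disease_2")}
disease_gene = {("disease_1", "gene_X"), ("disease_2", "gene_X"),
                ("disease_2", "gene_Y")}


def build_triplets(drug_disease, disease_gene):
    """Step 1: join the two pairwise sets on the shared disease."""
    drugs_by_disease = defaultdict(set)
    genes_by_disease = defaultdict(set)
    for drug, disease in drug_disease:
        drugs_by_disease[disease].add(drug)
    for disease, gene in disease_gene:
        genes_by_disease[disease].add(gene)

    triplets = set()
    # Only diseases that appear in both pairwise sets yield triplets.
    for disease in drugs_by_disease.keys() & genes_by_disease.keys():
        for drug, gene in product(drugs_by_disease[disease],
                                  genes_by_disease[disease]):
            triplets.add((drug, disease, gene))
    return triplets


def predict_drug_disease(drug_disease, disease_gene):
    """Step 2 (assumed heuristic): score unobserved drug-disease pairs by
    how many genes the candidate disease shares with the diseases the
    drug is already linked to."""
    genes_by_disease = defaultdict(set)
    for disease, gene in disease_gene:
        genes_by_disease[disease].add(gene)
    diseases_by_drug = defaultdict(set)
    for drug, disease in drug_disease:
        diseases_by_drug[drug].add(disease)

    scores = {}
    for drug, known in diseases_by_drug.items():
        # Genes reachable from the drug through its known diseases.
        drug_genes = set()
        for disease in known:
            drug_genes |= genes_by_disease.get(disease, set())
        for candidate, genes in genes_by_disease.items():
            if candidate in known:
                continue  # link already observed
            shared = len(drug_genes & genes)
            if shared:
                scores[(drug, candidate)] = shared
    return scores


triplets = build_triplets(drug_disease, disease_gene)
candidates = predict_drug_disease(drug_disease, disease_gene)
```

In this toy network, drug_A has no observed link to disease_2, but both are connected to gene_X, so the pair surfaces as a candidate link. A candidate of this kind is a hypothesis for follow-up, not a validated association.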