Abstract: Precision medicine means giving patients the right treatment at the right dose at the right time, with minimal adverse effects and maximal efficacy. It is medicine personalized to the individual’s genes, environment, and lifestyle, and its widespread use will ultimately require a deep understanding of the genomic variations that create predispositions or resistances to various diseases. Some of the links between genes and diseases are already known, and more are being discovered every day. Similarly, much is known about which drugs are efficacious for treating which diseases, but there is still more to learn. The issue now is how to extract this information from the biomedical literature in a way that can keep pace with today’s rapid discoveries in medical research. Efforts to date to assemble an organized database of such knowledge have focused on mathematical statistics, computer-aided methods, and similar approaches. Success has been mixed: previous methods often produce false positives or depend on training sample sets, and so generalize poorly across research fields, which has held back advances in precision medicine. To break through this bottleneck, we need novel methods that can extract and leverage the valuable information locked within the data we have. Hence, in this paper, we present a new text-based computational framework for extracting full three-way drug-disease-gene triplet information related to colorectal cancer from biomedical texts. The framework consists of two main steps. The first is to construct an integrated drug-disease-gene network by extracting pairwise associations between diseases, drugs, and genes, and then to store the unique drug-disease-gene triplets for further analysis. Since the constructed network is likely to be sparse, the second step is to complete the missing links in the network, i.e., to predict novel links from genes to diseases to drugs.
To validate our framework, we conducted a case study on colorectal cancer, mining the literature for drug-disease and disease-gene associations. An analysis of the subsequent inferences drawn between the two shows that this approach can help to inform novel research hypotheses and identify new knowledge triplets about various diseases, both of which are significant for the advancement and implementation of precision medicine.
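
As a rough illustration of the first step described in the abstract, the sketch below joins drug-disease pairs and disease-gene pairs on their shared disease to produce unique drug-disease-gene triplets. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `build_triplets`, the tuple layout, and the placeholder entities (`drug_A`, `gene_X`, and so on) are hypothetical, and the text-mining stage that would supply the pairs is not shown. The link-prediction step is left out because the abstract does not specify the method.

```python
from collections import defaultdict


def build_triplets(drug_disease, disease_gene):
    """Join (drug, disease) and (disease, gene) pairs on the shared disease.

    Returns a sorted list of unique (drug, disease, gene) triplets.
    Illustrative only: the paper's actual network construction is not shown here.
    """
    # Index genes by disease so each drug-disease pair can be expanded quickly.
    genes_by_disease = defaultdict(set)
    for disease, gene in disease_gene:
        genes_by_disease[disease].add(gene)

    # A set removes duplicate triplets arising from repeated mentions.
    triplets = set()
    for drug, disease in drug_disease:
        for gene in genes_by_disease.get(disease, ()):
            triplets.add((drug, disease, gene))
    return sorted(triplets)


# Placeholder pairs (hypothetical, not mined from any real corpus).
drug_disease = [
    ("drug_A", "colorectal cancer"),
    ("drug_B", "colorectal cancer"),
    ("drug_A", "colorectal cancer"),  # duplicate mention
]
disease_gene = [
    ("colorectal cancer", "gene_X"),
    ("colorectal cancer", "gene_Y"),
    ("disease_Z", "gene_X"),  # no drug pair, so it yields no triplet
]

triplets = build_triplets(drug_disease, disease_gene)
for t in triplets:
    print(t)
```

Keying the join on the disease reflects the abstract's description of linking genes to diseases to drugs; a disease that appears in only one of the two pair lists contributes no triplet, which is one way the resulting network can end up sparse.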
