Quantitative measurement of clinic-genomic association for colorectal cancer using literature mining and Google-distance algorithm

Yaning Feng, Ling Zheng, Liying Song, Huilong Duan, Ning Deng
DOI: https://doi.org/10.1109/BMEI.2014.7002870
2014-01-01
Abstract: A growing number of researchers are re-mining existing biomedical knowledge, focusing on how to establish associations between clinical and genomic data. However, quantitative analysis of such associations for a particular disease remains inadequate. Colorectal cancer is one of the malignant tumors whose molecular mechanism is relatively clear, making it an appropriate object of study. This paper proposes a quantitative measurement of clinic-genomic associations for colorectal cancer based on Google Distance, using the MEDLINE database as the corpus. The method combines several techniques, including mapping clinical and genomic data to MeSH terms and modifying the Normalized Google Distance using a year average. Data from Electronic Medical Records (EMR), Online Mendelian Inheritance in Man (OMIM), and the Genetic Association Database (GAD) were used in this study. A total of 3795 clinic-genomic associations for colorectal cancer between 67 clinical concepts and 236 genes were obtained, of which 584 associations involve genes contained in the colorectal cancer pathway according to KEGG pathway analysis. Assessment and interpretation were conducted using KEGG and GeneCards, yielding new discoveries. The method is valid for quantitative analysis using biomedical literature, performs well in measuring associations between clinical and genomic data, and can be transferred to research on other diseases.
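For readers unfamiliar with the measure the abstract builds on, the following is a minimal sketch of the standard (unmodified) Normalized Google Distance computed from co-occurrence counts. It does not reproduce the paper's year-average modification, which the abstract does not specify. The function name, parameter names, and the counts in the usage note are hypothetical and chosen only for illustration.

```python
import math


def normalized_google_distance(fx, fy, fxy, n):
    """Standard NGD from document counts (illustrative sketch only).

    fx, fy -- number of documents mentioning term x and term y
    fxy    -- number of documents mentioning both terms
    n      -- total number of documents in the corpus
    """
    if fx <= 0 or fy <= 0 or n <= 0:
        raise ValueError("fx, fy and n must be positive")
    if min(fx, fy) >= n:
        raise ValueError("NGD is undefined when both terms occur in every document")
    if fxy <= 0:
        # Terms that never co-occur are treated as infinitely distant.
        return math.inf
    log_fx, log_fy, log_fxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(log_fx, log_fy) - log_fxy) / (math.log(n) - min(log_fx, log_fy))
```

As a usage illustration with made-up counts, a clinical MeSH term found in 1000 citations, a gene term found in 100, and 10 citations containing both, in a corpus of 1,000,000 citations, would be scored by `normalized_google_distance(1000, 100, 10, 1_000_000)`. Smaller values indicate a stronger association, and a value of 0 means the two terms always appear together.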