Semantic output-based disease-protein knowledge extraction

Zhi-heng LI, Zhi-hao YANG, Hong-fei LIN
DOI: https://doi.org/10.6040/j.issn.1671-9352.1.2015.025
2016-01-01
Abstract: With the rapid development of genomics and high-throughput technologies, a large amount of biomedical literature about genes and proteins has appeared. Meanwhile, text mining technology has made it possible to discover new, valuable knowledge about proteins from the mass of medical texts. This paper presents a system that extracts relations between proteins and specific diseases from biomedical literature, based on the semantic output generated by SemRep, and then extracts novel, valuable protein knowledge. The system summarizes the salient relations with a salient extraction algorithm applied to a subject-specific MEDLINE corpus. Subsequently, the results extracted by the system are compared with data from the KEGG database. The system is significant for understanding the major causes of many diseases, for protein function prediction, and for drug-aided design.