Dual sentence representation model integrating prior knowledge for bio-text-mining

Zhijing Li, Yangyang Lan, Saikat Chatterjee, Pargorn Puttapirat, Xiangrong Zhang, Chen Li
DOI: https://doi.org/10.1109/BIBM49941.2020.9313239
2020-01-01
Abstract: Data mining, especially the extraction of relationships between genes and proteins, plays an important role in the biomedical field. Several related models have been proposed for data mining in the biomedical domain. Furthermore, manually curated biomedical knowledge bases, which can assist the task, have been used to enhance data-mining models. However, owing to the limitations of existing methods, much of this prior knowledge is not fully exploited. In this work, we propose a novel method that applies curated prior knowledge to biomedical text mining through dual sentence representation models: one model for the experimental data and the other for the prior knowledge sentences. We evaluated our method on two community-supported datasets, the BioNLP and BioCreative corpora. The experimental results demonstrate that the dual sentence representation model can successfully use external prior knowledge to extract relations from biomedical text. Our method achieves state-of-the-art results and could be applied to biomedical relation extraction in the future.