Biomedical Causal Relation Extraction Via Data Augmentation and Multi-source Knowledge Fusion

Jing Hao, Lishuang Li, Xueyang Qin
DOI: https://doi.org/10.1109/bibm58861.2023.10386007
2023-01-01
Abstract: Biomedical causal relation extraction (BCRE), a sub-task of biomedical information extraction, aims to extract causal relations between events from unstructured biomedical texts and plays an important role in downstream tasks. Existing methods usually apply oversampling to address the unbalanced class distribution and limited knowledge of the datasets, which may ignore sample diversity. In addition, they usually encode the text with the pre-trained language model BioBERT, which captures only the context information of the text and may limit performance because the text information is insufficiently extracted. To address these problems, we propose a Multi-source Knowledge Fusion Network (MKFN) that augments the data and sufficiently extracts and fuses the text information and external knowledge for biomedical causal relation extraction. Specifically, we apply the large language model RoBERTa to augment samples in minority classes, and we filter external knowledge from multiple knowledge bases according to the relevance of the text to triples, as captured by the structure information. Afterward, the multi-source knowledge embedding, which comprises context information, structure information and the corresponding external knowledge, is acquired by different encoders. Additionally, we utilize triplet attention, which is introduced into event relation extraction for the first time, to fuse the multi-source knowledge embedding. Extensive experimental results on Hahn-Powell’s and BioCause datasets confirm that the proposed method achieves new state-of-the-art performance compared with current methods.
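The abstract contrasts plain oversampling with augmenting minority-class samples through a language model. The following is a minimal sketch of that general idea, not the authors' pipeline: it tops up each minority class to the size of the largest class by rewriting one token per copied sentence. The function `augment_minority`, the `fill_mask` callable and the toy substitution table are all hypothetical names introduced for illustration; in a real setting `fill_mask` would presumably wrap a masked language model such as RoBERTa.

```python
import random
from collections import Counter, defaultdict
from typing import Callable, List, Tuple

Sample = Tuple[str, str]  # (text, label)


def augment_minority(
    samples: List[Sample],
    fill_mask: Callable[[List[str], int], str],
    seed: int = 0,
) -> List[Sample]:
    """Top up every class to the size of the largest class.

    Each new sample is a copy of a randomly chosen sentence from the same
    class with one token replaced by fill_mask(tokens, position).
    Assumes every text contains at least one whitespace-separated token.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for text, label in samples:
        by_label[label].append(text)

    target = max(len(texts) for texts in by_label.values())
    augmented = list(samples)  # originals are kept unchanged, in order
    for label, texts in by_label.items():
        for _ in range(target - len(texts)):
            tokens = rng.choice(texts).split()
            pos = rng.randrange(len(tokens))
            tokens[pos] = fill_mask(tokens, pos)
            augmented.append((" ".join(tokens), label))
    return augmented


def toy_fill_mask(tokens: List[str], pos: int) -> str:
    """Hypothetical stand-in for a masked language model prediction."""
    substitutions = {"induces": "triggers", "inhibits": "suppresses"}
    return substitutions.get(tokens[pos], tokens[pos])


if __name__ == "__main__":
    data = [
        ("A induces B", "cause"),
        ("C binds D", "none"),
        ("E binds F", "none"),
        ("G binds H", "none"),
    ]
    balanced = augment_minority(data, toy_fill_mask)
    print(Counter(label for _, label in balanced))
```

When the stand-in returns the original token, the new sample is an exact copy, which is ordinary oversampling; the diversity the abstract refers to would come from a model that proposes varied, context-appropriate replacements.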