A tag based joint extraction model for Chinese medical text

XingYu Liu, Yu Liu, HangYu Wu, QingQuan Guan
DOI: https://doi.org/10.1016/j.compbiolchem.2021.107508
IF: 3.737
2021-08-01
Computational Biology and Chemistry
Abstract:<p>Information extraction in the medical field is an important method for structuring medical knowledge and discovering new knowledge. Traditional methods handle this task in a pipelined manner, treating entity recognition and relation extraction as two separate sub-tasks, which neglects the relevance between them. In recent years, research on joint extraction models has achieved encouraging results in the general domain, yet work on joint extraction models applied to the medical field remains insufficient. In this paper, we construct a joint extraction model based on a tagging scheme for Chinese medical texts. First, we design a series of preprocessing procedures for Chinese medical data to obtain effective Chinese word sequences. Then, we propose the BIOH12D1D2 tagging scheme to convert the joint extraction task into a tagging problem and to solve the overlapping entity problem. After that, we use an encoder-decoder model to obtain the predicted tag sequence. In the decoding layer, the BERT pre-trained model is adopted to extract token features and enhance the feature representation ability of our model. Finally, the joint extraction model achieves an F1 score of 0.7 on CHIP-2020, an increase of 0.364 over the baseline.</p>
biology,computer science, interdisciplinary applications