IMPROVING CHINESE MEDICAL NAMED ENTITY RECOGNITION USING GLYPH AND LEXICON

SHANHAO ZHONG, QINGSONG YU
DOI: https://doi.org/10.12783/dtssehs/aeim2021/35969
2021-03-29
Abstract: Medical named entity recognition is the first step in processing electronic medical records. It is the basis for converting medical natural language text into structured medical information, and it has high research and application value. In this paper, we propose a model that identifies several types of named entities (disease, imaging examination, laboratory examination, operation, drug, and anatomy) in Chinese electronic medical records. We construct a BERT-based model that fuses glyph and lexicon information. Experimental studies have shown that enriching the character-level semantic representation can improve the performance of named entity recognition. To enrich it, our model takes two main measures: (1) a CNN structure is proposed to capture glyph information; (2) the Soft-Lexicon method is introduced to encode lexicon information. Our models show an improvement over the baseline BERT-BiLSTM-CRF model. On the CCKS2019 dataset, the F1 score was 84.64, which is 1.99 points higher than the baseline.
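The lexicon-matching step behind Soft-Lexicon can be illustrated with a minimal sketch. It assumes the usual formulation of the method, in which each character collects the lexicon words that it begins (B), sits inside (M), ends (E), or forms alone (S). The function name `soft_lexicon_sets`, the `max_len` parameter, and the toy lexicon are hypothetical and are not taken from the paper; the later steps of the method (weighting each set and concatenating it with the character representation) are omitted.

```python
from typing import Dict, List, Set


def soft_lexicon_sets(
    sentence: str, lexicon: Set[str], max_len: int = 4
) -> List[Dict[str, Set[str]]]:
    """Collect the B/M/E/S word sets for every character of `sentence`.

    Illustrative sketch only: the name, signature, and max_len cutoff are
    assumptions, not the authors' implementation.
    """
    sets = [{"B": set(), "M": set(), "E": set(), "S": set()} for _ in sentence]
    n = len(sentence)
    for start in range(n):
        # Try every substring starting here, up to max_len characters long.
        for end in range(start + 1, min(n, start + max_len) + 1):
            word = sentence[start:end]
            if word not in lexicon:
                continue
            if len(word) == 1:
                # A single-character word goes into that character's S set.
                sets[start]["S"].add(word)
            else:
                sets[start]["B"].add(word)    # first character of the word
                sets[end - 1]["E"].add(word)  # last character of the word
                for mid in range(start + 1, end - 1):
                    sets[mid]["M"].add(word)  # interior characters
    return sets


# Toy lexicon (hypothetical entries, for illustration only).
toy_lexicon = {"胃", "胃癌", "胃癌根治术", "根治", "根治术"}
result = soft_lexicon_sets("胃癌根治术", toy_lexicon, max_len=5)

for char, char_sets in zip("胃癌根治术", result):
    print(char, char_sets)
```

In this toy sentence the character 治 ends the word 根治 and lies inside both 根治术 and 胃癌根治术, so a single character can carry evidence for several candidate word boundaries at once.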