Exploiting the concept level feature for enhanced name entity recognition in Chinese EMRs

Qing Zhao, Dan Wang, Jianqiang Li, Faheem Akhtar
DOI: https://doi.org/10.1007/s11227-019-02917-3
IF: 3.3
2019-06-10
The Journal of Supercomputing
Abstract: The accumulation and explosive growth of electronic medical records (EMRs) have made named entity recognition (NER) technologies critical for the meaningful use of EMR data and, in turn, for the practice of evidence-based medicine. The dominant NER approaches use distributed representations of words and characters to build deep learning-based NER models. However, in biomedical named entity recognition there are a large number of complicated medical terms composed of multiple words. Splitting these terms to learn word and character embeddings might cause semantic ambiguities. In this paper, we treat each medical term as a concept and propose a concept-enhanced named entity recognition model (CNER), in which features at three granularities (i.e., concept, word, and character) are combined for bio-NER. Extensive experiments are conducted on two real-world corpora: a fully labeled corpus and a partially labeled corpus. CNER achieves the highest F1 score (fully labeled corpus: precision = 88.23, recall = 88.29, and F1 = 88.26; partially labeled corpus: precision = 87.03, recall = 88.19, and F1 = 87.61), outperforming the baseline CW-BLSTM-CRF approach by 0.58% and 1.15%, respectively, which demonstrates the effectiveness of the proposed approach.
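The F1 scores reported in the abstract can be cross-checked against the stated precision and recall. The snippet below is a minimal sketch that assumes F1 is the standard harmonic mean of precision and recall; the abstract does not state the exact evaluation protocol (for example, whether scores are micro- or macro-averaged over entity types), and the helper name `f1_score` is illustrative only.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given on the same scale)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) exactly as given in the abstract
reported = {
    "fully labeled corpus": (88.23, 88.29, 88.26),
    "partially labeled corpus": (87.03, 88.19, 87.61),
}

for corpus, (p, r, f1) in reported.items():
    print(f"{corpus}: computed F1 = {f1_score(p, r):.2f}, reported F1 = {f1:.2f}")
```

Under this assumption, the computed values agree with the reported F1 scores to two decimal places for both corpora.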