Named Entity Recognition in Chinese Electronic Medical Records Based on CRF.

Kaixin Liu, Qingcheng Hu, Jianwei Liu, Chunxiao Xing
DOI: https://doi.org/10.1109/wisa.2017.8
2017-01-01
Abstract: Large volumes of Electronic Medical Records (EMRs) contain a great deal of knowledge, and Named Entity Recognition (NER) in Chinese EMRs is an important task. However, because Chinese medical dictionaries are lacking, there are few studies on NER in Chinese EMRs. In this paper, we first build a medical dictionary. We then investigate the effects of different types of features, including bag-of-characters, part-of-speech, dictionary, and word-clustering features, on Chinese clinical NER based on the Conditional Random Fields (CRF) algorithm, the most popular algorithm for NER. For the experiments, we randomly selected 220 clinical texts from Peking Anzhen Hospital. The results show that these features benefit Chinese named entity recognition to varying degrees. Finally, from an analysis of the experimental results, we derive some rules of thumb.
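The four feature types named in the abstract can be illustrated with a minimal sketch of a character-level feature extractor. The abstract does not give the actual feature templates, so everything here is an assumption made for illustration: the function names `char_features` and `sentence_features`, the context window size, the dictionary matching rule (any dictionary term of bounded length that covers the current character), and the representation of word clusters as a character-to-cluster-ID mapping. The output is a list of feature dictionaries, one per character, which is a common input shape for CRF toolkits.

```python
from typing import Dict, List, Optional, Set


def char_features(
    chars: List[str],
    i: int,
    pos_tags: Optional[List[str]] = None,
    dictionary: Optional[Set[str]] = None,
    clusters: Optional[Dict[str, str]] = None,
    window: int = 1,
    max_term_len: int = 4,
) -> Dict[str, object]:
    """Build a feature dict for the character at position i (hypothetical templates)."""
    feats: Dict[str, object] = {"bias": 1.0}

    # Bag-of-characters: the current character and its neighbours in the window.
    for off in range(-window, window + 1):
        j = i + off
        feats[f"char[{off}]"] = chars[j] if 0 <= j < len(chars) else "<PAD>"

    # Part of speech: assumes one tag per character, projected from a word-level tagger.
    if pos_tags is not None:
        feats["pos"] = pos_tags[i]

    # Dictionary feature: does any dictionary term of length <= max_term_len cover position i?
    if dictionary is not None:
        in_dict = False
        for start in range(max(0, i - max_term_len + 1), i + 1):
            for end in range(i + 1, min(len(chars), start + max_term_len) + 1):
                if "".join(chars[start:end]) in dictionary:
                    in_dict = True
        feats["in_dict"] = in_dict

    # Word-clustering feature: cluster ID looked up from a precomputed mapping.
    if clusters is not None:
        feats["cluster"] = clusters.get(chars[i], "<UNK>")

    return feats


def sentence_features(chars: List[str], **kwargs) -> List[Dict[str, object]]:
    """Feature dicts for every character in one sentence."""
    return [char_features(chars, i, **kwargs) for i in range(len(chars))]


if __name__ == "__main__":
    sentence = list("患者有高血压")   # toy sentence: "the patient has hypertension"
    toy_dictionary = {"高血压"}       # toy medical dictionary with a single term
    for ch, feats in zip(sentence, sentence_features(sentence, dictionary=toy_dictionary)):
        print(ch, feats)
```

Switching individual feature groups on and off through the optional arguments mirrors the kind of feature comparison the abstract describes, although the concrete templates used in the paper may differ.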