Hitsz-cner: A hybrid system for entity recognition from Chinese clinical text

Hu Jianglu, Shi Xue, Liu Zengjian, Wang Xiaolong, Chen Qingcai, Tang Buzhou
2017-01-01
Abstract: With the rapid development of electronic medical records, increasing attention has been paid to reusing these data for research and commercial purposes. As entity recognition is one of the most fundamental tasks in medical information extraction, the 2017 China Conference on Knowledge Graph and Semantic Computing (CCKS) challenge set up a track for clinical named entity recognition (CNER). The organizers provided 400 annotated Chinese medical records for this track, of which 300 are used as a training set and 100 as a test set. A further 2,605 raw medical records were released as an unlabeled set. In this study, we develop a hybrid system based on rule, CRF (conditional random fields) and RNN (recurrent neural network) methods for the CNER task. Experiments on the official test set show that our system achieves F1-scores of 91.08% and 94.26% under the "strict" and "relaxed" criteria respectively, ranking first in the 2017 CCKS CNER challenge. By applying a self-training method with unlabeled data, the F1-scores of all machine learning-based methods are improved by about 1.0% under the "strict" criterion. Our future work will focus on more effective extraction of body, disease and treatment entities.
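The abstract reports F1-scores under "strict" and "relaxed" criteria without defining them. The sketch below illustrates one common reading, stated here as an assumption rather than the official CCKS scoring rule: a strict match requires identical boundaries and type, while a relaxed match requires the same type and overlapping boundaries. The `Entity` tuple layout, the function names, and the sample spans are all hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# Hypothetical representation: (start offset, exclusive end offset, entity type).
Entity = Tuple[int, int, str]


def strict_match(a: Entity, b: Entity) -> bool:
    """Assumed 'strict' criterion: same boundaries and same type."""
    return a == b


def relaxed_match(a: Entity, b: Entity) -> bool:
    """Assumed 'relaxed' criterion: same type and overlapping spans."""
    return a[2] == b[2] and a[0] < b[1] and b[0] < a[1]


def f1_score(gold: List[Entity], pred: List[Entity], relaxed: bool = False) -> float:
    """Entity-level F1 under the chosen matching criterion."""
    match = relaxed_match if relaxed else strict_match
    # Predicted entities that match at least one gold entity (for precision).
    matched_pred = sum(any(match(p, g) for g in gold) for p in pred)
    # Gold entities that match at least one predicted entity (for recall).
    matched_gold = sum(any(match(g, p) for p in pred) for g in gold)
    precision = matched_pred / len(pred) if pred else 0.0
    recall = matched_gold / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Made-up spans: one exact hit, one boundary error, one miss.
gold = [(0, 2, "body"), (5, 9, "disease"), (12, 15, "treatment")]
pred = [(0, 2, "body"), (6, 9, "disease"), (20, 22, "treatment")]

strict_f1 = f1_score(gold, pred)
relaxed_f1 = f1_score(gold, pred, relaxed=True)
```

In this toy case the prediction with a shifted left boundary counts only under the relaxed criterion, which is why relaxed scores are never lower than strict ones, consistent with the 94.26% versus 91.08% figures reported above.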