Named Entity Recognition of Chinese Electronic Medical Records Based on Cascaded Conditional Random Field

Xiaoyu Chen, Shenghui Shi, Siyan Zhan, Daguang Jiang, Xiaoyong Lin
DOI: https://doi.org/10.1109/icbda.2019.8713244
2019-01-01
Abstract: Medical information carried by electronic medical records has high clinical application value, and named entity recognition is the key task in extracting valuable information from large numbers of electronic medical records. To enable automatic identification of named entities in Chinese electronic medical record texts, this paper analyzes the characteristics that affect recognition performance. The feature set is composed of language symbol features, part-of-speech features, context features, word boundary features, and identifier features. A feature template is designed and a Cascaded Conditional Random Field model is established. Based on a data selection strategy, we trained the Cascaded Conditional Random Field to identify disease names, drug names, and symptom names in Chinese electronic medical records. This method reduced the scale of the training data, reduced the amount of manual annotation, and improved the recognition performance for disease names, drug names, and symptom names, with F-values reaching 68.18%, 90.91%, and 87.13%, respectively.
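The five feature groups named in the abstract could be assembled per character roughly as in the sketch below. This is only an illustration under stated assumptions, not the paper's feature template: the feature names, the one-character context window, the B/M/E/S boundary scheme, and the `CUE_CHARS` list standing in for "identifier" features are all hypothetical, and the cascaded CRF model itself is not shown.

```python
# Hypothetical sketch: per-character feature dicts that a CRF tagger could consume.
# Feature names, window size, and the cue list are assumptions, not the paper's template.

CUE_CHARS = {"炎", "症", "病", "片", "素"}  # assumed identifier cues (illustrative only)


def char_type(ch):
    """Coarse language-symbol class of a single character."""
    if ch.isdigit():
        return "digit"
    if "\u4e00" <= ch <= "\u9fff":
        return "cjk"
    if ch.isalpha():
        return "latin"
    return "punct"


def boundary_tags(words):
    """Map segmented words to per-character B/M/E/S word-boundary tags."""
    tags = []
    for w in words:
        if len(w) == 1:
            tags.append("S")
        else:
            tags.extend(["B"] + ["M"] * (len(w) - 2) + ["E"])
    return tags


def extract_features(tagged_words, window=1):
    """Build one feature dict per character.

    tagged_words: list of (word, pos) pairs from any segmenter / POS tagger.
    """
    chars = [c for w, _ in tagged_words for c in w]
    pos = [p for w, p in tagged_words for _ in w]
    bounds = boundary_tags([w for w, _ in tagged_words])
    feats = []
    for i, ch in enumerate(chars):
        f = {
            "char": ch,
            "type": char_type(ch),      # language symbol feature
            "pos": pos[i],              # part-of-speech feature
            "boundary": bounds[i],      # word boundary feature
            "is_cue": ch in CUE_CHARS,  # assumed identifier feature
        }
        # Context features: neighbouring characters within the window.
        for d in range(1, window + 1):
            f[f"char-{d}"] = chars[i - d] if i - d >= 0 else "<BOS>"
            f[f"char+{d}"] = chars[i + d] if i + d < len(chars) else "<EOS>"
        feats.append(f)
    return feats


example = extract_features([("患者", "n"), ("有", "v"), ("肺炎", "n")])
```

Each dict in `example` describes one character; a sequence of such dicts, paired with entity labels, is the usual input shape for CRF toolkits, though the exact format depends on the toolkit chosen.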