A Combined Classification Model for Chinese Clinical Notes

Jun Liang, Xiaowei Zheng, MeiFang Xu, Xiaojun He, YaWen Xing
2013-01-01
Abstract: A patient's drinking status plays an important role in research on, and diagnosis of, a variety of health problems. Information about alcohol use is usually sparse and hidden in large volumes of unstructured narrative EMR text, so extracting and classifying it requires sophisticated techniques. This article describes a method for extracting and classifying patients' drinking status that combines rule-based and machine-learning techniques, and assesses the performance of the resulting system. The system achieved an F-measure of 87.30% on the training dataset and 82.60% on the test dataset, measured by the macro-averaged F-score, a common evaluation metric in NLP and medical statistics. The experimental results show that machine-learning models can replace labor-intensive steps in the rule-based system, while the hybrid system retains some of the advantages of a rule-based system, such as high performance. The system can therefore be applied to screening for drinking as a risk factor in epidemiological studies whose study populations are drawn from large-scale EMR data.
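As an illustration of the evaluation metric named in the abstract, the following is a minimal sketch of a macro-averaged F-score, assuming the common variant that averages per-class F1 scores with equal weight per class. The abstract does not specify which variant the authors used, and the function name `macro_f1` and the class labels are hypothetical, chosen only for this example.

```python
from collections import Counter


def macro_f1(gold, pred):
    """Macro-averaged F1: the unweighted mean of per-class F1 scores.

    Assumption: this is the average-of-per-class-F1 variant; the paper's
    exact definition is not given in the abstract.
    """
    labels = sorted(set(gold) | set(pred))
    if not labels:
        return 0.0

    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[g] += 1  # true class g was missed

    scores = []
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        total = precision + recall
        scores.append(2 * precision * recall / total if total else 0.0)

    # Every class counts equally, regardless of how many notes it has.
    return sum(scores) / len(scores)


# Hypothetical labels, not taken from the paper.
gold = ["drinker", "drinker", "non-drinker", "unknown"]
pred = ["drinker", "non-drinker", "non-drinker", "unknown"]
score = macro_f1(gold, pred)
```

Because each class contributes equally, a rare class (for example, a small number of notes that mention drinking) affects the score as much as a frequent one, which is why this metric suits data where the target information is sparse.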