A CRF-based Method for Automatic Construction of Chinese Symptom Lexicon

Meizhi Ju, Huilong Duan, Haomin Li
DOI: https://doi.org/10.1109/itme.2015.90
2015-01-01
Abstract: Lexicons play a key role in Medical Language Processing (MLP). Constructing a semantic lexicon has become a prerequisite for MLP research in China, where few clinical terminology resources are available. This study proposes an iterative machine learning algorithm based on Conditional Random Fields (CRF) to automatically build a symptom lexicon from a clinical corpus. The algorithm was evaluated comprehensively under both exact and inexact matching. It achieved an F-measure of 87.23%, with precision of 99.95% and recall of 72.23%. Furthermore, a lexicon containing 22,501 symptoms was constructed using this approach.
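To illustrate the difference between exact and inexact evaluation, here is a minimal Python sketch. The abstract does not define its matching criteria, so this assumes that an exact match requires identical span boundaries and that an inexact match requires only an overlap. The names `Span`, `overlaps`, and `prf`, and the toy spans, are hypothetical and do not come from the paper.

```python
from typing import List, Tuple

# A term mention as (start, end) character offsets, end exclusive (assumed format).
Span = Tuple[int, int]


def overlaps(a: Span, b: Span) -> bool:
    """Return True when the two spans share at least one character."""
    return a[0] < b[1] and b[0] < a[1]


def prf(gold: List[Span], pred: List[Span], exact: bool = True) -> Tuple[float, float, float]:
    """Compute precision, recall, and F-measure for predicted spans.

    exact=True  : a prediction counts only if its boundaries equal a gold span.
    exact=False : a prediction counts if it overlaps any gold span (assumption).
    """
    if exact:
        gold_set, pred_set = set(gold), set(pred)
        matched_pred = sum(1 for p in pred if p in gold_set)
        matched_gold = sum(1 for g in gold if g in pred_set)
    else:
        matched_pred = sum(1 for p in pred if any(overlaps(p, g) for g in gold))
        matched_gold = sum(1 for g in gold if any(overlaps(g, p) for p in pred))

    precision = matched_pred / len(pred) if pred else 0.0
    recall = matched_gold / len(gold) if gold else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure


# Toy data: three gold symptom mentions, two predictions (one with a wrong boundary).
gold = [(0, 2), (5, 9), (12, 15)]
pred = [(0, 2), (5, 8)]

exact_scores = prf(gold, pred, exact=True)
inexact_scores = prf(gold, pred, exact=False)
print("exact:", exact_scores)
print("inexact:", inexact_scores)
```

Under these assumptions, the prediction with a shifted boundary counts as an error in the exact setting but as a hit in the inexact setting, so the inexact scores are never lower than the exact ones.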