A Bootstrapping Approach to Symptom Entity Extraction on Chinese Electronic Medical Records.

Tianyi Qin, Yi Guan
DOI: https://doi.org/10.1007/978-3-319-47674-2_34
2016-01-01
Abstract: Symptom entities are widely distributed in Chinese electronic medical records (EMRs). Previous approaches to symptom entity extraction usually extract continuous strings as symptom entities and require massive human effort for corpus annotation. We describe a symptom entity as a two-tuple of <subject, lesion> and design a soft pattern matching method to locate such tuples in EMR sentences. Our bootstrapping approach requires only a few annotated symptom tuples and allows iterative extraction from massive electronic medical record databases without human supervision. Furthermore, the described method annotates symptom entities in EMRs using the extracted tuples. Starting with 60 annotated entities, our approach reached an F value of 81.40% in the task of extracting 3,150 entities from 992 sets of electronic medical records.
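The general shape of a bootstrapping loop like the one the abstract describes can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how patterns are represented or scored, so the Jaccard similarity, the 0.5 threshold, the rule that context words cannot be tuple endpoints, and the function names (`middle_context`, `similarity`, `bootstrap`) are all assumptions made for illustration. The toy sentences are pre-tokenized English rather than Chinese EMR text, purely for readability.

```python
from itertools import combinations

def middle_context(tokens, i, j):
    """Tokens strictly between positions i and j (i < j)."""
    return tuple(tokens[i + 1:j])

def similarity(a, b):
    """Jaccard overlap of two contexts (assumed stand-in for a soft match score)."""
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0

def bootstrap(sentences, seeds, threshold=0.5, max_iters=10):
    """Grow a set of (subject, lesion) tuples from a few seed tuples."""
    tuples = set(seeds)
    for _ in range(max_iters):
        # Step 1: induce context patterns from sentences containing known tuples.
        patterns = set()
        for tokens in sentences:
            for subj, lesion in tuples:
                if subj in tokens and lesion in tokens:
                    i, j = tokens.index(subj), tokens.index(lesion)
                    if i < j:
                        patterns.add(middle_context(tokens, i, j))
        # Assumption of this sketch: words seen inside patterns are context
        # words and are never accepted as a subject or a lesion.
        context_words = {w for p in patterns for w in p}
        # Step 2: softly match the patterns against every candidate token pair.
        found = set()
        for tokens in sentences:
            for i, j in combinations(range(len(tokens)), 2):
                if tokens[i] in context_words or tokens[j] in context_words:
                    continue
                ctx = middle_context(tokens, i, j)
                if any(similarity(ctx, p) >= threshold for p in patterns):
                    found.add((tokens[i], tokens[j]))
        # Step 3: stop once an iteration yields nothing new.
        if found <= tuples:
            break
        tuples |= found
    return tuples

sentences = [
    ["abdomen", "shows", "mild", "diffuse", "tenderness"],
    ["liver", "shows", "mild", "enlargement"],
    ["chest", "shows", "pain"],
    ["patient", "denies", "fever"],
]
seeds = {("abdomen", "tenderness")}
result = bootstrap(sentences, seeds)
print(sorted(result))
```

In this toy run the single seed tuple yields one pattern, which softly matches the second sentence; the pattern learned from that new tuple then matches the third sentence in the next iteration. The fourth sentence shares no context words with any pattern and contributes nothing.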