Term Extraction and Negation Detection Method in Chinese Clinical Document

李昊旻,李莹,段会龙,吕旭东
DOI: https://doi.org/10.3969/j.issn.0258-8021.2008.05.016
2008-01-01
Abstract: Narrative clinical documents contain a wealth of information for medical research. Indexing these documents with concepts from a biomedical terminology can improve information retrieval and data mining in medical records. International work in this area has been under way for several years, but research on Chinese clinical documents has been lacking. After analyzing the particular characteristics of Chinese medical language, this paper integrated the Chinese version of the International Classification of Diseases (ICD) into the Unified Medical Language System (UMLS) terminology system and proposed a term extraction method and a negation detection method for Chinese clinical documents, which can be used to build a concept-based index for those documents. In the term extraction phase, the high-sensitivity Reverse Maximum Matching (RMM) method was used, together with a general-purpose Chinese word segmentation tool to reduce false positives. In the negation detection phase, a simplified syntax pattern matching approach was proposed. The two algorithms were tested and evaluated on two clinical document data sets. The term extraction algorithm achieved sensitivities of 99.51% and 100%, with wrong detection rates of 1.46% and 1.66%. On both data sets the negation detection algorithm had a positive predictive value of 100%, with negative predictive values of 100% and 98.99%. The negation detection algorithm worked correctly except where unusual punctuation was used in the clinical documents.
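The abstract names Reverse Maximum Matching (RMM) but does not give its details. The following is a minimal sketch of generic dictionary-based RMM, not the authors' implementation. The function name `reverse_maximum_match`, the toy lexicon, and the sample sentence are all hypothetical; in the paper the lexicon would be the integrated UMLS and Chinese ICD terminology. The word-segmentation filter used to reduce false positives is not shown.

```python
def reverse_maximum_match(text, lexicon, max_len=None):
    """Scan text from right to left; at each position take the longest
    lexicon entry that ends there. Returns (start, end, term) tuples
    in left-to-right order."""
    if max_len is None:
        # Longest term in the lexicon bounds the matching window.
        max_len = max((len(term) for term in lexicon), default=0)
    matches = []
    end = len(text)
    while end > 0:
        matched = False
        # Try the widest window first, then shrink it from the left.
        for size in range(min(max_len, end), 0, -1):
            candidate = text[end - size:end]
            if candidate in lexicon:
                matches.append((end - size, end, candidate))
                end -= size  # jump past the matched term
                matched = True
                break
        if not matched:
            end -= 1  # no term ends here; move one character left
    matches.reverse()
    return matches


# Hypothetical toy lexicon (hypertension, diabetes); not from the paper.
toy_lexicon = {"高血压", "糖尿病"}
sample = "患者有高血压和糖尿病史"
print(reverse_maximum_match(sample, toy_lexicon))
```

Matching the longest candidate first favours sensitivity: any lexicon term that ends at the scan position is reported, which is consistent with the abstract's description of RMM as a high-sensitivity method that needs a second step to filter false positives.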