Towards Automated Knowledge Discovery of Hepatocellular Carcinoma - Extract Patient Information from Chinese Clinical Reports.

Hongmei Yang, Lin Li, Ridong Yang, Yi Zhou
DOI: https://doi.org/10.1145/3239438.3239445
2018-01-01
Abstract: Objectives: To accurately determine significant prognostic risk factors, patient information must be quantified accurately according to the extent of disease. An essential step in predicting prognostic risk factors is determining patient features, which are typically hidden in electronic medical records (EMR). The goal of this study is to extract clinical entities from Chinese clinical reports, enabling automated hepatocellular carcinoma knowledge extraction. Materials and Methods: We annotated hepatocellular carcinoma corpora using patient records from an EMR database, and we present an information extraction solution based on an assembled method. Our evaluation dataset contains 3996 training sentences and 1570 test sentences. The evaluation metrics are precision, recall, and F1 under exact matching. Results and Conclusions: Named entity recognition (NER) on admission reports, radiology reports, and discharge summaries achieved F1 scores of 0.8449, 0.5935, and 0.7320, respectively. The overall F1 for relation extraction (RE) is 0.9129. This study lays a foundation for larger population studies to identify clinical features of hepatocellular carcinoma.
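The evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes that a predicted entity counts as correct only when its span boundaries and its type both match a gold annotation exactly; the abstract does not spell out the matching rule. The `(start, end, type)` tuple format, the function name `exact_match_prf`, and the example labels are hypothetical and are not taken from the paper.

```python
from typing import Iterable, Tuple

# Hypothetical entity format: (start offset, end offset, entity type).
Entity = Tuple[int, int, str]


def exact_match_prf(gold: Iterable[Entity],
                    pred: Iterable[Entity]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) where only exact span+type matches count."""
    gold_set, pred_set = set(gold), set(pred)
    true_positives = len(gold_set & pred_set)

    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Illustrative annotations (made up for this sketch, not data from the study).
gold = [(0, 2, "Disease"), (5, 8, "Location"), (10, 12, "Size")]
# The second prediction has the wrong end offset, so it is not an exact match.
pred = [(0, 2, "Disease"), (5, 7, "Location"), (10, 12, "Size")]

p, r, f = exact_match_prf(gold, pred)
print(f"precision={p:.4f} recall={r:.4f} f1={f:.4f}")
```

Under this strict rule, a prediction that overlaps a gold entity but misses a boundary by one character is scored as both a false positive and a false negative, which is one reason exact-match scores tend to be lower than relaxed (overlap-based) scores.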