Deep Learning Based Information Extraction Framework on Chinese Electronic Health Records

Bing Tian, Yong Zhang, Kaixin Liu, Chunxiao Xing
DOI: https://doi.org/10.18293/seke2018-040
2018-01-01
Abstract: Electronic Health Records (EHRs) store a large amount of clinical data associated with each patient. Information extraction from the unstructured clinical notes in EHRs is important because it can substantially improve patient health management. Previous studies have mainly focused on English corpora, and research on Chinese EHRs remains very limited. Because of the challenges posed by the characteristics of the Chinese language, it is difficult to apply existing techniques developed for English to Chinese corpora. In this paper, we propose a deep learning based framework for information extraction from clinical notes in Chinese EHRs. Our framework consists of three components: data preprocessing, feature generation, and an entity and relation extractor. For clinical entity recognition, we propose a novel Conditional Random Field (CRF) based model and introduce effective features by leveraging the characteristics of the Chinese language. For relation extraction, we utilize a Convolutional Neural Network (CNN) to obtain high-quality entity-relation facts. To the best of our knowledge, this is the first framework to apply deep learning to information extraction from clinical notes in Chinese EHRs. We conduct extensive experiments on real-world hospital datasets. The experimental results demonstrate the effectiveness of our framework and indicate its practical value.