Traditional Chinese Medicine Clinical Records Classification Using Knowledge-Powered Document Embedding

Liang Yao, Yin Zhang, Baogang Wei, Zherong Li, Xiangzhou Huang
DOI: https://doi.org/10.1109/bibm.2016.7822817
2016-01-01
Abstract: Text classification is one of the fundamental tasks in text mining. In the medical domain, a number of studies have addressed text classification of modern medicine clinical notes written in English. However, very little text classification research has been conducted on clinical notes written in Chinese, especially traditional Chinese medicine (TCM) clinical records. The goal of this study was to investigate features and machine learning classification algorithms for TCM clinical text classification. We collected 7,037 TCM clinical records of famous TCM doctors as our dataset and investigated the effects of different types of features and classification algorithms. Additionally, we proposed a novel method that combines deep learning text representation with TCM domain knowledge, which yielded the best classification performance.