TCM Clinic Records Data Mining Approaches Based on Weighted-LDA and Multi-Relationship LDA Model

Fan Lin, Jianbing Xiahou, Zhuxiang Xu
DOI: https://doi.org/10.1007/s11042-016-3363-9
IF: 2.577
2016-01-01
Multimedia Tools and Applications
Abstract: As an important part of traditional medicine, TCM (Traditional Chinese Medicine) has distinctive clinical effects in disease diagnosis and treatment. Thousands of years of TCM practice have accumulated abundant clinical data and medical literature, including valuable TCM theories and rules of clinical practice. Researchers have applied various methods, such as clustering analysis, association rules, and regression analysis, to study TCM theory. However, none of them reflects well the semantic complexity of TCM or the systemic characteristics of TCM treatment. This paper studies the inherent rules of TCM clinic records with a topic model. Building on the LDA model, a weighting mechanism is applied to each feature word to improve the distinguishability and interpretability of the topics. The modeled topics are then used as features for an SVM (Support Vector Machine) classifier to improve classification accuracy. The number of topics in the LDA model is determined from the KL (Kullback-Leibler) distance and the similarity between topics. After analyzing the relationship between the topic model and TCM differentiation and treatment, a Multi-Relationship Topics LDA model is proposed on the basis of the LDA model and the Author-topic model to automatically extract the topic structures among the four parties and to explore clinically significant relationships among them. In addition, the relevancy between the parties and the feature-word weighting mechanism are used to improve the Multi-Relationship Topics LDA model and the classification accuracy of the topics. The experiments show that analyzing clinical data with topic models can extract TCM treatment rules and provides a novel theoretical method for TCM clinical research.
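The abstract does not spell out how the KL distance is used to fix the number of topics, so the following is only a minimal sketch of one common reading: for each candidate topic count, measure how far apart the fitted topic-word distributions are, and prefer the count whose topics are most distinct. The function names (`kl_divergence`, `symmetric_kl`, `mean_pairwise_distance`) and the toy distributions are assumptions made for illustration, not the authors' procedure or data.

```python
import math
from itertools import combinations


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same vocabulary.

    eps guards against log(0); it is an illustrative smoothing choice.
    """
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def symmetric_kl(p, q):
    """Symmetrised KL distance: the average of KL(p || q) and KL(q || p)."""
    return 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))


def mean_pairwise_distance(topics):
    """Average symmetric KL distance over all pairs of topic-word distributions."""
    pairs = list(combinations(topics, 2))
    return sum(symmetric_kl(p, q) for p, q in pairs) / len(pairs)


# Hypothetical topic-word distributions over a 4-word vocabulary.
distinct_topics = [
    [0.7, 0.1, 0.1, 0.1],
    [0.1, 0.7, 0.1, 0.1],
    [0.1, 0.1, 0.7, 0.1],
]
overlapping_topics = [
    [0.40, 0.30, 0.20, 0.10],
    [0.35, 0.35, 0.20, 0.10],
    [0.40, 0.25, 0.25, 0.10],
]

# A larger mean distance suggests the topics are easier to tell apart.
score_distinct = mean_pairwise_distance(distinct_topics)
score_overlapping = mean_pairwise_distance(overlapping_topics)
```

In practice the distributions would come from LDA models fitted with different topic counts, and the count with the largest mean distance (or the lowest topic similarity) would be a candidate choice.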