Parecat: Patient Record Subcategorization For Precision Traditional Chinese Medicine

Edward W. Huang, Sheng Wang, Runshun Zhang, Baoyan Liu, Xuezhong Zhou, Chengxiang Zhai
DOI: https://doi.org/10.1145/2975167.2975213
2016-01-01
Abstract: Traditional Chinese medicine (TCM), a style of medicine widely used in China for thousands of years, can complement modern western medicine by taking personalization as the core principle of clinical practice. A fundamental task in TCM, particularly important for achieving effective precision medicine, is to subcategorize patients with a general disease into groups corresponding to variations of that disease. In this paper, we conduct the first study of the problem of subcategorizing electronic patient records in TCM. While the general problem of subcategorization can be solved using basic clustering algorithms, accommodating variations in symptoms and herb prescriptions of TCM patient records when computing patient similarity is a major technical challenge that has yet to be addressed. To tackle this problem, we propose to learn inexact matchings of both symptoms and herbs from a TCM dictionary of herb functions by using an embedding algorithm. Our hypothesis is that the prior knowledge of herb-symptom associations in the TCM dictionary can be used to discover latent relationships among comorbid symptoms and functionally similar herbs, thereby improving the quality of subcategorization. We performed extensive experiments on large-scale real-world datasets. As expected, our approach leads to more accurate matchings between patient records than baseline approaches, and thus better subcategorization results. We also show that the proposed algorithm can be used immediately in multiple clinical applications, such as retrieving similar patients as well as discovering two special TCM cases: similar symptoms treated by different herbs and different symptoms treated by similar herbs.
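To make the idea of inexact matching concrete, the following is a minimal sketch and not the authors' implementation. It assumes each symptom already has an embedding vector and compares two patient records by averaging best-match cosine similarities. The vectors, the symptom names, and the functions `soft_similarity` and `exact_similarity` are all hypothetical, chosen only for illustration; the abstract does not specify the embedding algorithm or the similarity measure.

```python
import math

# Hypothetical 2-d embeddings. In the paper these would be learned from a TCM
# dictionary of herb functions; the values below are invented for illustration.
EMBEDDINGS = {
    "cough": (1.0, 0.0),
    "phlegm": (0.8, 0.6),
    "insomnia": (0.0, 1.0),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def exact_similarity(record_a, record_b):
    """Baseline: Jaccard overlap, which only credits identical terms."""
    a, b = set(record_a), set(record_b)
    return len(a & b) / len(a | b) if (a | b) else 0.0


def soft_similarity(record_a, record_b, embeddings=EMBEDDINGS):
    """Inexact matching (an assumed scheme): for each term, take its best
    cosine match in the other record, average, then symmetrize."""
    def directed(src, dst):
        if not src or not dst:
            return 0.0
        best = [max(cosine(embeddings[s], embeddings[d]) for d in dst) for s in src]
        return sum(best) / len(src)

    return 0.5 * (directed(record_a, record_b) + directed(record_b, record_a))


patient_1 = ["cough"]
patient_2 = ["phlegm"]
patient_3 = ["insomnia"]

print(exact_similarity(patient_1, patient_2))  # no shared terms
print(soft_similarity(patient_1, patient_2))   # related terms still match
print(soft_similarity(patient_1, patient_3))   # unrelated terms do not
```

With these toy vectors, the exact baseline sees no overlap between the first two records, while the soft measure scores them at roughly 0.8. A similarity of this kind could then be passed to any standard clustering algorithm to form subcategories.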