Identifying Symptom Clusters in Breast Cancer and Colorectal Cancer Patients using EHR Data.

Priyanka Gandhi, Xiao Luo, Susan Storey, Zuoyi Zhang, Zhi Han, Kun Huang
DOI: https://doi.org/10.1145/3307339.3342164
2019-01-01
Abstract: Patients with chronic conditions such as breast cancer and colorectal cancer often present with a range of symptoms, such as 'fatigue', 'pain' and 'depression'. Left untreated, these symptoms add to patients' distress and functional impairment. In this research, we investigate a symptom clustering and association mining framework that first extracts and clusters the symptoms from Electronic Health Record (EHR) clinical reports, and then analyzes the associations between symptom clusters and clinical attributes. The Universal Sentence Encoder and a modified seed-based k-means algorithm are used for symptom encoding and clustering. The results show that the symptom clusters have different associations in breast cancer and colorectal cancer, as well as in different time frames after chemotherapy. The results also show that breast cancer patients have slightly more symptoms from these three symptom clusters than colorectal cancer patients within 12 months after chemotherapy, whereas the colorectal cancer patient cohort has slightly more depression on average between 48 and 54 months after chemotherapy. By applying association rule mining, we find some informative rules, such as 'if a patient is at a higher stage of colorectal cancer (3B) but has no fatigue symptom, he or she likely does not have depression or peripheral neuropathy'. Our methods can be generalized to analyze symptom clusters of other chronic diseases where symptom management is critical.
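The kind of rule quoted in the abstract can be scored with the standard support and confidence measures of association rule mining. The following is a minimal sketch, not the authors' implementation: the patient records, the item names (such as `stage=3B` and `no_fatigue`) and the helper functions `support` and `confidence` are all hypothetical and chosen only for illustration.

```python
from typing import FrozenSet, List

# Hypothetical toy records: each patient is a set of attribute/symptom items.
# These are illustrative only and are NOT data from the paper.
records: List[FrozenSet[str]] = [
    frozenset({"stage=3B", "no_fatigue", "no_depression", "no_neuropathy"}),
    frozenset({"stage=3B", "no_fatigue", "no_depression", "no_neuropathy"}),
    frozenset({"stage=3B", "fatigue", "depression", "no_neuropathy"}),
    frozenset({"stage=2A", "no_fatigue", "no_depression", "neuropathy"}),
]


def support(itemset: FrozenSet[str], data: List[FrozenSet[str]]) -> float:
    """Fraction of records that contain every item in `itemset`."""
    if not data:
        return 0.0
    return sum(1 for record in data if itemset <= record) / len(data)


def confidence(antecedent: FrozenSet[str],
               consequent: FrozenSet[str],
               data: List[FrozenSet[str]]) -> float:
    """Estimate of P(consequent | antecedent) over the records."""
    antecedent_support = support(antecedent, data)
    if antecedent_support == 0:
        return 0.0
    return support(antecedent | consequent, data) / antecedent_support


# Rule shaped like the abstract's example:
# {stage 3B, no fatigue} -> {no depression, no peripheral neuropathy}
antecedent = frozenset({"stage=3B", "no_fatigue"})
consequent = frozenset({"no_depression", "no_neuropathy"})

rule_support = support(antecedent | consequent, records)
rule_confidence = confidence(antecedent, consequent, records)
print(f"support={rule_support:.2f}, confidence={rule_confidence:.2f}")
```

A rule is usually kept only when both values exceed chosen thresholds; the abstract does not state which thresholds or which mining algorithm were used, so none are assumed here.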