Data Reconstruction Based on Temporal Expressions in Clinical Notes

Zhikun Zhang, Chunlei Tang, Joseph M. Plasek, Yun Xiong, Min-Jeoung Kang, Patricia C. Dykes, David W. Bates, Li Zhou
DOI: https://doi.org/10.1109/bibm47256.2019.8983207
2019-01-01
Abstract: Learning representations of clinical notes is challenging because their complex content requires preprocessing before the data are suitable for data mining. An important issue, addressed here, is that of temporal expressions, in which cues indicate when clinical events occur. We present a three-step data reconstruction algorithm that transforms similar clinical entities (e.g., symptoms, complications) into sequential data through unsupervised annotation of temporal expressions. First, the algorithm detects whether an expression has temporal intent. Second, it decomposes and rewrites the expression into a non-temporal sub-expression and temporal constraints. Finally, it clusters similar non-temporal sub-expressions using unsupervised sentence embedding under a modified K-medoids paradigm. We evaluated the proposed algorithm on clinical notes associated with chronic obstructive pulmonary disease (COPD). Visualizing the reconstruction results for cardiology reports from a longitudinal cohort of patients with COPD demonstrated that the algorithm is feasible.
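The three steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the regular expression `TEMPORAL` is a hypothetical detector covering only a few cue patterns, token-set Jaccard distance stands in for the unsupervised sentence embedding, and the clustering is plain K-medoids rather than the modified variant the paper describes. All function names are assumptions made for illustration.

```python
import re

# Hypothetical cue pattern; a real detector would cover far more expressions.
TEMPORAL = re.compile(
    r"\b(?:\d+\s+(?:day|week|month|year)s?\s+ago"
    r"|(?:since|in)\s+\d{4}"
    r"|yesterday|today|last\s+(?:night|week|month|year))\b",
    re.IGNORECASE,
)

def has_temporal_intent(expr):
    """Step 1: does the expression contain a temporal cue?"""
    return TEMPORAL.search(expr) is not None

def decompose(expr):
    """Step 2: split into a non-temporal sub-expression and temporal constraints."""
    constraints = [m.group(0).lower() for m in TEMPORAL.finditer(expr)]
    rest = " ".join(TEMPORAL.sub(" ", expr).split()).strip(" ,.;")
    return rest, constraints

def jaccard_distance(a, b):
    """Stand-in for embedding distance (assumption): 1 - token-set overlap."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 0.0
    return 1.0 - len(ta & tb) / len(ta | tb)

def assign(items, medoids):
    """Label each item with the index of its nearest medoid."""
    return [
        min(range(len(medoids)), key=lambda c: jaccard_distance(x, items[medoids[c]]))
        for x in items
    ]

def k_medoids(items, k, max_iter=20):
    """Step 3: plain alternating K-medoids with a deterministic start."""
    if not 0 < k <= len(items):
        raise ValueError("k must be between 1 and len(items)")
    medoids = list(range(k))  # first k items serve as initial medoids
    for _ in range(max_iter):
        labels = assign(items, medoids)
        updated = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:  # keep the old medoid for an empty cluster
                updated.append(medoids[c])
                continue
            # New medoid: the member with the smallest total distance to the others.
            updated.append(min(
                members,
                key=lambda i: sum(jaccard_distance(items[i], items[j]) for j in members),
            ))
        if updated == medoids:
            break
        medoids = updated
    return assign(items, medoids), medoids

def reconstruct(expressions, k):
    """Run all three steps; returns (cluster, sub-expression, constraints) tuples."""
    parts = [decompose(e) for e in expressions if has_temporal_intent(e)]
    labels, _ = k_medoids([p[0] for p in parts], k)
    return [(lab, sub, cons) for lab, (sub, cons) in zip(labels, parts)]

if __name__ == "__main__":
    notes = [
        "Chest pain 3 days ago",
        "Severe chest pain since 2015",
        "Shortness of breath last week",
        "Mild shortness of breath yesterday",
        "No acute distress",  # no temporal cue, so it is skipped
    ]
    for row in reconstruct(notes, k=2):
        print(row)
```

Grouping the returned tuples by cluster label and ordering each group by its temporal constraints would give one sequence per clinical entity, which is the kind of sequential data the abstract refers to.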