An Effective Standardization Method for the Lab Indicators in Regional Medical Health Platform Using N-grams and Stacking

Jiaying Zhang, Qi Wang, Zhixing Zhang, Yangming Zhou, Qi Ye, Huanhuan Zhang, Jiahui Qiu, Ping He
DOI: https://doi.org/10.1109/bibm.2018.8621274
2018-01-01
Abstract: Since 2008, a regional medical health platform has been built to manage the electronic health records of top public hospitals in Shanghai. However, public hospitals often use different names for the same laboratory examination item (or lab indicator) on this regional platform, which seriously hinders the interconnection and sharing of medical information among hospitals. In this paper, we propose an effective method to standardize lab indicators using n-gram features and a Stacking mechanism. Our proposed method sequentially combines a clustering model and a binary classification model. More specifically, we first cluster the lab indicators based on character uni-gram similarity distances to reduce the alignment scale, and then apply a binary classification algorithm, built with a Stacking mechanism on character n-gram similarity features, to generate candidate-standard indicator pairs iteratively. Experiments on clinical data collected from eight top public hospitals in Shanghai show that our proposed method achieves an F1-score of 88.43% in the final binary classification, which is highly competitive with baseline methods.
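The first stage described in the abstract (character uni-gram clustering, then character n-gram similarity features for candidate pairs) can be illustrated with a minimal sketch. The abstract does not name the similarity measure, the clustering algorithm, or the n-gram orders, so the Jaccard similarity, the greedy single-pass clustering, the `max_distance` threshold, the orders `(1, 2, 3)`, and all function names below are assumptions made for illustration, not the authors' implementation. The Stacking classifier that would consume these features is omitted.

```python
def char_ngrams(text, n):
    """Return the set of character n-grams of a lowercased, stripped string."""
    text = text.strip().lower()
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets (assumed measure; 0.0 if both are empty)."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def ngram_features(name_a, name_b, orders=(1, 2, 3)):
    """One similarity feature per n-gram order for a candidate-standard pair."""
    return [jaccard(char_ngrams(name_a, n), char_ngrams(name_b, n))
            for n in orders]


def cluster_by_unigram(names, max_distance=0.5):
    """Greedy single-pass clustering on uni-gram distance (1 - Jaccard).

    Each name joins the first cluster whose first member is within
    max_distance; otherwise it starts a new cluster. This is a stand-in
    for whatever clustering model the paper actually uses.
    """
    clusters = []
    for name in names:
        grams = char_ngrams(name, 1)
        for cluster in clusters:
            representative = char_ngrams(cluster[0], 1)
            if 1.0 - jaccard(grams, representative) <= max_distance:
                cluster.append(name)
                break
        else:
            clusters.append([name])
    return clusters


if __name__ == "__main__":
    # Hypothetical indicator names, not taken from the paper's data.
    names = ["hemoglobin", "haemoglobin", "glucose", "blood glucose"]
    for cluster in cluster_by_unigram(names):
        print(cluster)
    print(ngram_features("hemoglobin", "haemoglobin"))
```

Clustering first means that pair features only need to be computed within each cluster rather than across all indicator names, which is the reduction in alignment scale the abstract refers to.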