Ontology-based Venous Thromboembolism Risk Assessment Model Developing from Medical Records

Yuqing Yang, Xin Wang, Yu Huang, Ning Chen, Juhong Shi, Ting Chen
DOI: https://doi.org/10.1186/s12911-019-0856-2
2018-01-01
Abstract: The Padua linear model is widely used to assess the risk of venous thromboembolism (VTE), a common and preventable complication in inpatients. However, differences in race, genetics, and environment between Western and Chinese populations limit the Padua model's validity in Chinese patients. Extracting VTE risk factors from unstructured medical records in Chinese hospitals can help to understand VTE events and to develop an efficient risk assessment model. In this study, we propose an ontology-based method that combines natural language processing (NLP) and machine learning (ML) to mine VTE risk factors. Medical records of 3106 inpatients were processed, and terms from multiple ontologies, drawn from various sections of the records and enriched in VTE patients, were sorted automatically. ML methods were then used to estimate the importance of the terms; terms from the admitting diagnosis and progress notes showed better VTE prediction performance than terms from other sections. Finally, a novel VTE prediction model built on the selected terms achieved a higher AUC (0.815) than the Padua model (0.789).
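The workflow summarized in the abstract (rank terms by enrichment in VTE patients, keep the informative ones, and evaluate a score by AUC) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the abstract does not state which enrichment statistic or classifier was used, so the smoothed log odds ratio, the simple term-count score, the function names `term_enrichment` and `auc`, and the toy records are all hypothetical stand-ins rather than the authors' method or data.

```python
from math import log

def term_enrichment(records, smoothing=0.5):
    """Rank terms by a smoothed log odds ratio (VTE vs. non-VTE records).

    `records` is a list of (set_of_terms, label) pairs, label 1 = VTE.
    The statistic is an illustrative choice, not the one from the paper.
    """
    n_pos = sum(1 for _, label in records if label == 1)
    n_neg = len(records) - n_pos
    vocab = set().union(*(terms for terms, _ in records))
    scores = {}
    for term in vocab:
        pos = sum(1 for terms, label in records if label == 1 and term in terms)
        neg = sum(1 for terms, label in records if label == 0 and term in terms)
        odds_pos = (pos + smoothing) / (n_pos - pos + smoothing)
        odds_neg = (neg + smoothing) / (n_neg - neg + smoothing)
        scores[term] = log(odds_pos / odds_neg)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

def auc(scores, labels):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy, made-up records: terms already extracted from notes and mapped to ontology labels.
records = [
    ({"malignancy", "bed rest"}, 1),
    ({"malignancy", "fever"}, 1),
    ({"bed rest", "cough"}, 1),
    ({"fever", "cough"}, 0),
    ({"cough"}, 0),
    ({"fever"}, 0),
    ({"bed rest", "fever"}, 0),
]

ranked = term_enrichment(records)
# Keep only terms that are more frequent in VTE records than in controls.
selected = {term for term, score in ranked if score > 0}

# Score each patient by the number of selected terms found in the record.
patient_scores = [len(terms & selected) for terms, _ in records]
labels = [label for _, label in records]
auc_value = auc(patient_scores, labels)
print(ranked[0][0], sorted(selected), round(auc_value, 3))
```

Note that this toy example scores the same records it used for term selection, so its AUC is optimistic; a real comparison against the Padua model would need held-out data.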