Utility Balanced Classification for Automatic Electronic Medical Record Analysis

Liansheng Wang, Qiuhao Xu, Shuo Li
DOI: https://doi.org/10.1109/icsai.2018.8599490
2018-01-01
Abstract: Imbalanced data classification is a critical issue in data analysis, especially in automatic clinical diagnosis and treatment. However, because practical applications such as clinical data analysis usually involve high complexity and diversity, conventional classification methods suffer from high cost and low reliability when faced with complex clinical data. It is therefore challenging to obtain an effective, reliable, and precise method. In this paper, we propose a utility balanced classifier (UBC) for automatic diagnostic prediction from electronic medical records. UBC introduces two innovations to handle imbalanced data: (1) the concept of utility, which describes the effectiveness of the data during classification and handles the nonlinear relationship between medical record features and quantitative evaluation parameters; and (2) the application of focal loss to imbalanced data, which plays an important role in correcting mislabeled data. Experiments show that our method achieves high accuracy on a comprehensive clinical dataset, indicating its practical value in clinical diagnosis and treatment.
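For readers unfamiliar with focal loss, the following is a minimal sketch of the commonly used binary form, which down-weights well-classified examples so that hard or minority-class examples dominate the loss. The abstract does not give the exact formulation used in UBC, so the function name `focal_loss`, the parameters `gamma` and `alpha`, and their default values are assumptions chosen for illustration only.

```python
import math


def focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-12):
    """Binary focal loss for a single example (illustrative sketch).

    p: predicted probability of the positive class, in [0, 1]
    y: true label, 0 or 1
    gamma: focusing parameter (assumed default); gamma = 0 gives
           alpha-weighted cross-entropy
    alpha: weight for the positive class (assumed default)
    """
    # Probability the model assigns to the true class.
    p_t = p if y == 1 else 1.0 - p
    # Class-balancing weight for the true class.
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # Clamp to avoid log(0).
    p_t = min(max(p_t, eps), 1.0 - eps)
    # The factor (1 - p_t)^gamma shrinks the loss of easy examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# A confidently correct prediction contributes far less than a wrong one.
easy = focal_loss(0.9, 1)
hard = focal_loss(0.1, 1)
```

With `gamma = 0` the modulating factor disappears and the expression reduces to class-weighted cross-entropy, which makes the role of the focusing term easy to isolate.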