Automatic Coding for Neonatal Jaundice From Free Text Data Using Ensemble Methods

Scott Werwath
DOI: https://doi.org/10.48550/arXiv.1805.01054
2018-05-03
Abstract: This study explores the creation of a machine learning model that automatically identifies, from the associated clinical notes, whether a Neonatal Intensive Care Unit (NICU) patient was diagnosed with neonatal jaundice during a particular hospitalization. We develop a number of techniques for text preprocessing and feature selection and compare the effectiveness of different classification models. We show that ensemble decision tree classification, both with AdaBoost and with bagging, outperforms support vector machines (SVM), the current state-of-the-art technique for neonatal jaundice coding.
Computation and Language
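
To make the ensemble idea in the abstract concrete, here is a minimal sketch of AdaBoost over one-token decision stumps applied to bag-of-words features. It is not the paper's pipeline: the six notes are invented toy examples, the helper names (`tokenize`, `stump_vote`, `train_adaboost`, `predict`) are hypothetical, single-token stumps stand in for full decision trees, and bagging is not shown.

```python
import math
import re

def tokenize(note):
    """Lowercase a note and return its set of alphabetic tokens."""
    return set(re.findall(r"[a-z]+", note.lower()))

def stump_vote(token, polarity, doc):
    """A stump votes `polarity` if the token is present, else the opposite."""
    return polarity if token in doc else -polarity

def train_adaboost(docs, labels, rounds=3):
    """Fit AdaBoost with one-token stumps; labels are +1 or -1."""
    vocab = sorted(set().union(*docs))
    weights = [1.0 / len(docs)] * len(docs)
    ensemble = []  # list of (alpha, token, polarity)
    for _ in range(rounds):
        # Pick the stump with the lowest weighted training error.
        best = None
        for token in vocab:
            for polarity in (1, -1):
                err = sum(w for w, d, y in zip(weights, docs, labels)
                          if stump_vote(token, polarity, d) != y)
                if best is None or err < best[0]:
                    best = (err, token, polarity)
        err, token, polarity = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, token, polarity))
        # Increase the weight of misclassified notes, then renormalize.
        weights = [w * math.exp(-alpha * y * stump_vote(token, polarity, d))
                   for w, d, y in zip(weights, docs, labels)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, doc):
    """Return the sign of the alpha-weighted sum of stump votes."""
    score = sum(alpha * stump_vote(token, polarity, doc)
                for alpha, token, polarity in ensemble)
    return 1 if score >= 0 else -1

# Invented toy notes; +1 means "jaundice coded", -1 means "not coded".
notes = [
    ("Infant with elevated bilirubin, phototherapy started.", 1),
    ("Jaundice noted on exam, bilirubin rising.", 1),
    ("Phototherapy continued for hyperbilirubinemia.", 1),
    ("Respiratory distress, on CPAP, feeding well.", -1),
    ("Stable vitals, tolerating feeds, no concerns.", -1),
    ("Sepsis workup negative, antibiotics stopped.", -1),
]
docs = [tokenize(text) for text, _ in notes]
labels = [label for _, label in notes]

model = train_adaboost(docs, labels, rounds=3)
for alpha, token, polarity in model:
    print(f"{token!r} polarity={polarity:+d} alpha={alpha:.3f}")
```

No single token separates the toy classes, so each stump alone misclassifies at least one note; the weighted vote is what combines them into a usable classifier. In practice a library implementation with real decision trees and held-out evaluation would replace this sketch.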