Predicting hospital length of stay using machine learning on a large open health dataset

Raunak Jain, Mrityunjai Singh, A. Ravishankar Rao, Rahul Garg
DOI: https://doi.org/10.1186/s12913-024-11238-y
2024-07-30
BMC Health Services Research
Abstract: Governments worldwide face growing pressure to increase transparency, as citizens demand greater insight into decision-making processes and public spending. One example is the release of open healthcare data to researchers, since healthcare is one of the top economic sectors. Extracting meaning and value from these datasets requires significant information systems development and computational experimentation. We use a large open health dataset provided by the New York State Statewide Planning and Research Cooperative System (SPARCS) containing 2.3 million de-identified patient records. One of the fields in these records is a patient's length of stay (LoS) in a hospital, which is crucial for estimating healthcare costs and planning hospital capacity for future needs. Hospitals would therefore benefit from being able to predict LoS early. Machine learning offers a potential solution, which is the focus of this paper.
health care sciences & services