New approaches to missing biomedical data recovery for machine learning

Victor Iapascurta, Ion Fiodorov
DOI: https://doi.org/10.52326/jes.utm.2023.30(1).09
2023-04-01
Journal of Engineering Science
Abstract: Missing data is a common problem in medical data sets, especially large ones. The issue matters because it can affect both the analysis and the further use of the data, e.g., for machine learning. There are various methods for recovering missing data. One is to remove observations with missing values, but this is not very useful given the limited amount of data available. Another commonly used approach is Last Observation Carried Forward (LOCF). Most such methods, however, are not universal and may need to be adjusted to the data set at hand. This article explores how the problem can be addressed for multimodal time series of biomedical data from patients with sepsis. It describes and compares three approaches tailored to a sepsis data set, which is analyzed and then used to build a sepsis prediction system based on clinical data routinely recorded in an intensive care unit.
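
The two baseline methods named in the abstract, removing incomplete observations and LOCF, can be illustrated with a minimal sketch. This is not the authors' code, and it does not show the three tailored approaches the paper compares. The column names (`patient_id`, `hour`, `heart_rate`) and all values are hypothetical, and pandas is assumed as the tooling.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: hourly heart rate for two patients.
# Column names and values are illustrative, not taken from the paper.
records = pd.DataFrame({
    "patient_id": [1, 1, 1, 2, 2, 2],
    "hour": [0, 1, 2, 0, 1, 2],
    "heart_rate": [88.0, np.nan, 95.0, np.nan, 110.0, np.nan],
})

# Baseline 1: drop every observation with a missing value.
# Simple, but it discards half of this small data set.
complete_cases = records.dropna(subset=["heart_rate"])

# Baseline 2: Last Observation Carried Forward (LOCF).
# Forward-fill within each patient so values never leak across patients.
locf = records.sort_values(["patient_id", "hour"]).copy()
locf["heart_rate"] = locf.groupby("patient_id")["heart_rate"].ffill()

print(complete_cases)
print(locf)
```

In this sketch LOCF leaves the first reading of patient 2 missing, because there is no earlier observation to carry forward. This is one reason such methods may need adjusting to the data set at hand.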