Systematic Review on Missing Data Imputation Techniques with Machine Learning Algorithms for Healthcare

Amelia Ritahani Ismail, Nadzurah Zainal Abidin, Mhd Khaled Maen
DOI: https://doi.org/10.18196/jrc.v3i2.13133
2022-02-05
Journal of Robotics and Control (JRC)
Abstract: Missing data is one of the most common issues encountered in the data cleaning process, especially when dealing with medical datasets. A dataset collected in the real world is prone to being incomplete, inconsistent, noisy, and redundant for reasons such as human error, instrument failure, and patient death. To deal with incomplete data accurately, sophisticated algorithms have been proposed to impute the missing values, and many machine learning algorithms have been applied to fill them with plausible values. Among these, the KNN algorithm has been widely adopted for missing data imputation because of its robustness and simplicity, and it shows promise of outperforming other machine learning methods. This paper provides a comprehensive review of the different imputation techniques used to replace missing data. The goal of the review is to draw attention to potential improvements to existing methods and to give readers a better grasp of trends in imputation techniques.
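To make the idea of KNN imputation concrete, the following is a minimal sketch and is not taken from the paper. It assumes numeric records with missing entries marked as None, measures distance only over the features two records share, and fills each gap with the mean of that feature over the k nearest records. The function name knn_impute, the parameter k, and the patient-style sample values are all hypothetical and chosen only for illustration.

```python
import math


def knn_impute(rows, k=2):
    """Return a copy of rows with None entries filled by a k-nearest-neighbour mean."""

    def distance(a, b):
        # Compare only the features that are observed in both rows.
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf  # nothing in common, so the rows cannot be compared
        # Rescale so that rows sharing fewer features are not favoured.
        scale = len(a) / len(shared)
        return math.sqrt(scale * sum((x - y) ** 2 for x, y in shared))

    imputed = [list(row) for row in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: other rows in which feature j is observed.
            donors = [
                (distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            ]
            donors = [d for d in donors if d[0] < math.inf]
            donors.sort(key=lambda d: d[0])
            nearest = [v for _, v in donors[:k]]
            if nearest:  # leave the gap untouched if no donor exists
                imputed[i][j] = sum(nearest) / len(nearest)
    return imputed


# Hypothetical records: [age, systolic blood pressure, fasting glucose].
records = [
    [50.0, 120.0, 90.0],
    [52.0, 125.0, None],  # glucose reading is missing
    [51.0, 122.0, 94.0],
    [80.0, 160.0, 150.0],
]
filled = knn_impute(records, k=2)
print(filled[1])
```

In this sketch the missing glucose value is taken from the two records most similar in age and blood pressure, so the distant fourth record does not influence it. Real medical data would normally need feature scaling first, since unscaled features with large ranges dominate the distance.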