An Experimental Survey of Missing Data Imputation Algorithms

Xiaoye Miao, Yangyang Wu, Lu Chen, Yunjun Gao, Jianwei Yin
DOI: https://doi.org/10.1109/tkde.2022.3186498
IF: 9.235
2022-01-01
IEEE Transactions on Knowledge and Data Engineering
Abstract: Because missing data is ubiquitous, data imputation has received extensive attention over the past decades. Missing data is a well-recognized problem affecting almost all fields of scientific study. Existing imputation algorithms differ in problem settings, model selection, and data evaluation, yet a systematic comparative study of these algorithms is lacking. In this paper, we survey this evolving research topic by broadly reviewing and experimentally comparing state-of-the-art missing data imputation algorithms. We analyze and categorize 19 imputation algorithms. Extensive experiments over 15 real-world benchmark datasets are conducted under various settings of data types, missing mechanisms, missing rates, and dataset/model parameters, as well as on the post-imputation prediction task. We provide a series of constructive insights on imputation algorithms for tackling imputation problems in real-life scenarios. Moreover, we put forward promising future directions for the data imputation problem.
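To make the kind of experiment described in the abstract concrete, the following is a minimal sketch of one evaluation loop: hide values at a chosen missing rate, impute them, and score the imputation on the hidden cells only. Everything specific here is an assumption for illustration and is not taken from the paper: the missing mechanism is assumed to be MCAR (missing completely at random), the imputer is plain column-mean imputation rather than any of the 19 surveyed algorithms, the metric is assumed to be RMSE, the data is synthetic, and the function names `mask_mcar`, `impute_column_mean`, and `rmse_on_missing` are hypothetical.

```python
import math
import random


def mask_mcar(rows, missing_rate, seed=0):
    """Hide each cell independently with probability `missing_rate` (assumed MCAR)."""
    rng = random.Random(seed)
    return [[None if rng.random() < missing_rate else v for v in row] for row in rows]


def impute_column_mean(rows):
    """Fill each missing cell with the mean of the observed values in its column."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if row[j] is not None]
        # Fall back to 0.0 if a column happens to be entirely missing.
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[means[j] if v is None else v for j, v in enumerate(row)] for row in rows]


def rmse_on_missing(truth, masked, imputed):
    """RMSE computed only over the cells that were hidden by the mask."""
    errors = [
        (t - p) ** 2
        for t_row, m_row, p_row in zip(truth, masked, imputed)
        for t, m, p in zip(t_row, m_row, p_row)
        if m is None
    ]
    return math.sqrt(sum(errors) / len(errors)) if errors else 0.0


# Toy complete data (synthetic, made up for illustration; not a paper dataset).
data_rng = random.Random(42)
truth = [[data_rng.gauss(0, 1), data_rng.gauss(5, 2)] for _ in range(200)]

# Sweep over missing rates, as one axis of an experimental comparison.
for rate in (0.1, 0.3, 0.5):
    masked = mask_mcar(truth, rate, seed=1)
    imputed = impute_column_mean(masked)
    print(f"missing rate {rate:.1f}: RMSE = {rmse_on_missing(truth, masked, imputed):.3f}")
```

Scoring only the hidden cells keeps the metric from being diluted by values that were never missing. Swapping in a different imputer or a different masking function would be the natural way to vary the algorithm or the missing mechanism in such a comparison.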