Best Fit Missing Value Imputation (BFMVI) Algorithm for Incomplete Data in the Internet of Things.

Benjamin Agbo, Yongrui Qin, Richard Hill
DOI: https://doi.org/10.5220/0009578201300137
2020-01-01
Abstract: The noticeable growth in the adoption of Internet of Things (IoT) technologies has led to the generation of large amounts of data, usually from sensor devices. When dealing with massive amounts of data, it is very common to observe databases with large numbers of missing values. This is a challenge for data miners because various methods for data analysis only work well on complete databases. A popular way to deal with this challenge is to fill in (impute) missing values using adequate estimation techniques. Unfortunately, many existing methods rely on all the observed values in the entire dataset to estimate missing values, which adversely affects the imputed results (low accuracy and high complexity). In this paper, we propose a novel imputation technique based on data clustering and a robust selection of adequate imputation equations for each missing datapoint. We evaluate our proposed method on six University of California Irvine (UCI) datasets and compare it with five recently proposed imputation methods. The results show that the proposed imputation method is comparable with the Local Similarity Imputation (LSI) technique in terms of imputation accuracy, but is significantly less complex than all the existing methods identified.
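
The general idea of imputing from a local cluster rather than from the whole dataset can be illustrated with a minimal sketch. This is not the authors' BFMVI algorithm: the abstract does not specify the clustering method or the imputation equations, so the sketch substitutes plain k-means on the complete rows and fills each missing value with the corresponding coordinate of the nearest centroid. The function name `impute_by_cluster`, the parameters `k` and `iters`, and the toy data are all assumptions made for illustration.

```python
import math


def impute_by_cluster(rows, k=2, iters=10):
    """Fill None entries with the matching coordinate of the nearest centroid.

    Illustrative only: plain k-means plus cluster-mean filling, NOT the
    BFMVI equation-selection step described in the paper.
    """
    complete = [r for r in rows if None not in r]
    if len(complete) < k:
        raise ValueError("need at least k complete rows to form clusters")

    # Deterministic initialisation (an assumption): evenly spaced complete rows.
    centroids = [list(complete[i * len(complete) // k]) for i in range(k)]

    for _ in range(iters):
        groups = [[] for _ in centroids]
        for r in complete:
            nearest = min(range(k), key=lambda c: math.dist(r, centroids[c]))
            groups[nearest].append(r)
        for c, members in enumerate(groups):
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]

    filled = []
    for r in rows:
        observed = [j for j, v in enumerate(r) if v is not None]
        # Distance to each centroid is measured on the observed features only.
        best = min(
            centroids,
            key=lambda c: sum((r[j] - c[j]) ** 2 for j in observed),
        )
        filled.append([best[j] if v is None else v for j, v in enumerate(r)])
    return filled


# Toy data (hypothetical): two well-separated groups, two incomplete rows.
data = [
    [1.0, 1.0],
    [1.2, 0.8],
    [9.0, 9.0],
    [9.2, 8.8],
    [1.1, None],
    [None, 9.1],
]
result = impute_by_cluster(data, k=2)
```

Because each missing value is estimated from a single cluster instead of every observed value, the estimate depends only on similar records, which is the motivation the abstract gives for clustering before imputation.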