Missing value imputation using unsupervised machine learning techniques
P. S. Raja,K. Thangavel
DOI: https://doi.org/10.1007/s00500-019-04199-6
IF: 3.732
2019-07-08
Soft Computing
Abstract:In data mining, preprocessing is one of the essential processes which involves data normalization, noise removal, handling missing values, etc. This paper focuses on handling missing values using unsupervised machine learning techniques. Soft computation approaches are combined with the clustering techniques to form a novel method to handle the missing values, which help us to overcome the problems of inconsistency. Rough K-means centroid-based imputation method is proposed and compared with K-means centroid-based imputation method, fuzzy C-means centroid-based imputation method, K-means parameter-based imputation method, fuzzy C-means parameter-based imputation method, and rough K-means parameter-based imputation methods. The experimental analysis is carried out on four benchmark datasets, viz. Dermatology, Pima, Wisconsin, and Yeast datasets, which have taken from UCI data repository. The proposed method proves the efficacy of different datasets, and the results are also promising one.
computer science, artificial intelligence, interdisciplinary applications