APT-KNN: AN EFFICIENT MISSING VALUE IMPUTATION METHOD ORIENTED TOWARD CLASSIFICATION ISSUE

Xu Yuming, Chen Cheng, Xiong Yun, Zhu Yangyong
DOI: https://doi.org/10.3969/j.issn.1000-386X.2011.04.040
2011-01-01
Abstract: Classification is one of the most common data mining tasks. A frequent data quality problem in classification is missing attribute values, and imputing the missing data can reduce the classification errors they cause. Imputation must first of all be accurate, and in many practical applications it must also be computationally efficient. In this paper we present APT-KNN, a new imputation method for missing attribute values. It exploits the relations among attributes and estimates a missing value from the attribute values of the several instances most similar to the target object, so as to achieve higher accuracy in the imputed results. In addition, an optimised AntiPole-Tree index structure is designed to improve the efficiency of missing value imputation. Experiments show that APT-KNN outperforms several existing missing value imputation methods in both efficiency and accuracy.
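The nearest-neighbour estimation step mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: it assumes numeric attributes, a Euclidean distance over the attributes observed in both instances, and a plain mean of the neighbours' values. The function name `knn_impute` and the sample data are hypothetical. The AntiPole-Tree index is not reproduced here; a brute-force linear scan stands in for it, which is exactly the cost such an index is meant to reduce.

```python
import math


def knn_impute(rows, target, col, k=3):
    """Estimate rows[target][col] from the k most similar rows.

    Assumptions (not taken from the paper): attributes are numeric,
    missing values are represented by None, and similarity is the
    root-mean-square difference over the other columns that are
    observed in both rows. Neighbours are found by linear scan.
    """
    query = rows[target]
    candidates = []
    for i, row in enumerate(rows):
        # A neighbour must be a different row with the target column present.
        if i == target or row[col] is None:
            continue
        shared = [
            j for j in range(len(row))
            if j != col and row[j] is not None and query[j] is not None
        ]
        if not shared:
            continue
        dist = math.sqrt(
            sum((row[j] - query[j]) ** 2 for j in shared) / len(shared)
        )
        candidates.append((dist, row[col]))
    if not candidates:
        raise ValueError("no usable neighbours for imputation")
    candidates.sort(key=lambda pair: pair[0])
    nearest = candidates[:k]
    # Impute with the mean of the neighbours' values in the target column.
    return sum(value for _, value in nearest) / len(nearest)


# Hypothetical data: the last row is missing its third attribute.
data = [
    [1.0, 2.0, 10.0],
    [1.1, 2.1, 11.0],
    [0.9, 1.9, 9.0],
    [8.0, 9.0, 50.0],
    [1.0, 2.0, None],
]
print(knn_impute(data, target=4, col=2, k=3))  # 10.0
```

The linear scan compares the target with every other instance, so imputing all missing values in a dataset of n instances costs on the order of n distance computations per missing value. An index structure over the instances, such as the optimised AntiPole-Tree the abstract refers to, aims to avoid most of those comparisons.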