Handling missing data through deep convolutional neural network

Hufsa Khan, Xizhao Wang, Han Liu
DOI: https://doi.org/10.1016/j.ins.2022.02.051
IF: 8.1
2022-05-01
Information Sciences
Abstract: The presence of missing data is a challenging issue in processing real-world datasets. Imputing the missing values improves data quality so that effective learning from the data can be achieved. Recently, deep learning has become the most powerful type of machine learning technique and can be used to discover hidden knowledge in large datasets and make accurate predictions. In this paper, we propose an imputation method that uses a convolutional neural network to impute the missing values. The missing value of each instance is imputed essentially by a trained kernel, whose weights are learned from the given data arranged spatially in the data matrix. To impute a missing value, the kernel computes a weighted sum of the neighboring elements in the array. In addition, because the true values that the missing entries should be replaced with are unavailable, we design a loss function that does not require knowing them. Our method is evaluated on UCI datasets in comparison with state-of-the-art methods. The experimental results show that the proposed approach performs comparably to or better than the other methods.
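As a rough illustration of the kernel step described in the abstract, the sketch below fills a missing cell with a weighted sum of its neighboring elements in the data matrix. It is not the authors' network: the kernel here is fixed rather than learned, and the function name impute_with_kernel, the renormalization over observed neighbors, and the zero fallback are all assumptions made for this example.

```python
from typing import List, Optional

Matrix = List[List[Optional[float]]]


def impute_with_kernel(data: Matrix, kernel: List[List[float]]) -> List[List[float]]:
    """Fill each missing cell (None) with a kernel-weighted sum of its observed neighbors.

    Hypothetical helper: the paper learns the kernel weights; here they are given.
    """
    rows, cols = len(data), len(data[0])
    k_rows, k_cols = len(kernel), len(kernel[0])
    r_off, c_off = k_rows // 2, k_cols // 2

    filled = [row[:] for row in data]  # copy so the input is left untouched
    for i in range(rows):
        for j in range(cols):
            if data[i][j] is not None:
                continue  # observed values are kept as they are

            weighted_sum = 0.0
            weight_total = 0.0
            for a in range(k_rows):
                for b in range(k_cols):
                    r, c = i + a - r_off, j + b - c_off
                    inside = 0 <= r < rows and 0 <= c < cols
                    if inside and data[r][c] is not None:
                        weighted_sum += kernel[a][b] * data[r][c]
                        weight_total += kernel[a][b]

            # Assumption: renormalize by the weights that actually contributed,
            # and fall back to 0.0 when no neighbor is observed.
            filled[i][j] = weighted_sum / weight_total if weight_total else 0.0
    return filled


data = [
    [1.0, 2.0, 3.0],
    [4.0, None, 6.0],
    [7.0, 8.0, 9.0],
]
uniform_kernel = [[1.0] * 3 for _ in range(3)]
result = impute_with_kernel(data, uniform_kernel)
print(result[1][1])  # 5.0, the mean of the eight observed neighbors
```

With a uniform kernel this reduces to neighborhood-mean imputation; in the method the abstract describes, the weights would instead be trained using a loss that does not need the true missing values, which this sketch does not attempt to reproduce.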
computer science, information systems