Takagi-Sugeno Modeling of Incomplete Data for Missing Value Imputation with the Use of Alternate Learning

Xiaochen Lai, Liyong Zhang, Xin Liu
DOI: https://doi.org/10.1109/access.2020.2991669
IF: 3.9
2020-01-01
IEEE Access
Abstract: Missing values often occur in real-world datasets, undermining data integrity and reducing the reliability of data mining. This paper proposes a Takagi-Sugeno (TS) fuzzy modeling method for incomplete data and uses it to estimate missing values. Because the relationships among attributes can differ from cluster to cluster, the method performs regression analysis on the subsets obtained by fuzzy clustering and constructs the global model as a weighted sum of the resulting regression models, which describes the relationships among attributes more precisely than traditional regression imputation. To handle the incomplete model input caused by missing values, we propose an alternate learning strategy that trains the model parameters and the imputations together: missing values are treated as variables so that modeling can proceed on incomplete data, and the imputations are updated as the model parameters are adjusted. With this strategy, the incomplete-input problem is resolved, and model accuracy and imputation performance improve collaboratively. Experimental results on several UCI datasets with different missing ratios and missing data mechanisms demonstrate the effectiveness of the proposed method and strategy.
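The abstract's main ideas (fuzzy clustering, one regression model per cluster, a membership-weighted global output, and alternation between fitting the model and updating the imputations) can be illustrated with a minimal sketch. It is not the paper's training procedure. It assumes fuzzy c-means for clustering and swaps in membership-weighted ridge least squares for the parameter update. The function names (`fcm_memberships`, `fit_fcm`, `ts_impute`) and all parameters (`c`, `m`, `ridge`, `outer_iters`) are hypothetical choices made for illustration.

```python
import numpy as np

def fcm_memberships(X, centers, m=2.0):
    """Fuzzy c-means memberships (rows sum to 1) for fixed cluster centers."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2) + 1e-12
    inv = d2 ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def fit_fcm(X, c, m=2.0, iters=20, rng=None):
    """Plain fuzzy c-means: alternate membership and center updates."""
    rng = np.random.default_rng(0) if rng is None else rng
    centers = X[rng.choice(len(X), size=c, replace=False)]
    for _ in range(iters):
        W = fcm_memberships(X, centers, m) ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
    return centers

def ts_impute(X_missing, c=2, outer_iters=10, ridge=1e-6, seed=0):
    """Alternate between fitting local linear models and re-imputing.

    Assumption: missing entries are encoded as NaN. Imputed entries act as
    variables: they feed the next model fit and are then overwritten by
    the refitted model's predictions.
    """
    X = np.array(X_missing, dtype=float)
    mask = np.isnan(X)
    # Initial guess: column means of the observed values.
    X[mask] = np.take(np.nanmean(X, axis=0), np.where(mask)[1])
    rng = np.random.default_rng(seed)
    n, d = X.shape
    for _ in range(outer_iters):
        # Model step, part 1: cluster the currently completed data.
        centers = fit_fcm(X, c, rng=rng)
        U = fcm_memberships(X, centers)
        for j in range(d):
            if not mask[:, j].any():
                continue
            others = [k for k in range(d) if k != j]
            A = np.hstack([X[:, others], np.ones((n, 1))])  # inputs + bias
            pred = np.zeros(n)
            for k in range(c):
                # Model step, part 2: membership-weighted ridge regression.
                Aw = A * U[:, k][:, None]
                theta = np.linalg.solve(
                    A.T @ Aw + ridge * np.eye(A.shape[1]), Aw.T @ X[:, j]
                )
                # TS-style output: membership-weighted sum of local models.
                pred += U[:, k] * (A @ theta)
            # Imputation step: update only the entries that were missing.
            X[mask[:, j], j] = pred[mask[:, j]]
    return X
```

Calling `ts_impute(data)` on a 2-D array with `NaN` marking missing entries returns a completed copy in which the observed entries are unchanged. The closed-form least-squares update is used here only to keep the sketch short; any other parameter update could be substituted inside the same alternating loop.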