TRIP: an Interactive Retrieving-Inferring Data Imputation Approach

Zhixu Li, Lu Qin, Hong Cheng, Xiangliang Zhang, Xiaofang Zhou
DOI: https://doi.org/10.1109/tkde.2015.2411276
IF: 9.235
2015-01-01
IEEE Transactions on Knowledge and Data Engineering
Abstract: Data imputation aims at filling in missing attribute values in databases. Existing imputation approaches for non-quantitative string data can be roughly put into two categories: (1) inferring-based approaches [2], and (2) retrieving-based approaches [1]. Specifically, the inferring-based approaches find substitutes or estimations for the missing values from the complete part of the data set. However, they typically fall short in filling in unique missing attribute values that do not exist in the complete part of the data set [1]. The retrieving-based approaches resort to external resources for help by formulating proper web search queries to retrieve web pages containing the missing values from the Web, and then extracting the missing values from the retrieved web pages [1]. This web-based retrieving approach achieves high imputation precision and recall, but it issues a large number of web search queries, which incurs a large overhead [1].
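The contrast between the two categories can be illustrated with a minimal sketch. It is not the paper's algorithm: the function names (infer_value, impute, stub_retrieve), the record layout, and the "most common value among matching complete records" rule are all assumptions made for illustration, and the web search is replaced by a stub lookup. The sketch only shows why inference cannot supply a value absent from the complete data, and why each fallback to retrieval costs one query.

```python
from collections import Counter


def infer_value(records, target, key_attr, key_value):
    """Inferring-based step (illustrative rule, an assumption): return the most
    common `target` value among complete records whose `key_attr` matches."""
    candidates = [
        r[target]
        for r in records
        if r.get(key_attr) == key_value and r.get(target) is not None
    ]
    if not candidates:
        return None  # the value does not exist in the complete part of the data
    return Counter(candidates).most_common(1)[0][0]


def impute(records, target, key_attr, retrieve):
    """Fill missing `target` values, trying inference first and falling back to
    `retrieve` (a hypothetical stand-in for a web search plus extraction).
    Returns the filled records and the number of retrieval queries issued."""
    queries = 0
    filled = []
    for record in records:
        record = dict(record)  # copy so the input is left untouched
        if record.get(target) is None:
            value = infer_value(records, target, key_attr, record.get(key_attr))
            if value is None:
                value = retrieve(record)
                queries += 1  # each fallback costs one external query
            record[target] = value
        filled.append(record)
    return filled, queries


def stub_retrieve(record):
    """Hypothetical retrieval stub: a fixed lookup instead of a real web search."""
    return {"Oslo": "Norway"}.get(record["city"])


records = [
    {"name": "A", "city": "Paris", "country": "France"},
    {"name": "B", "city": "Paris", "country": None},  # inferable from record A
    {"name": "C", "city": "Oslo", "country": None},   # unique: needs retrieval
]

filled, queries = impute(records, "country", "city", stub_retrieve)
print([r["country"] for r in filled], queries)
```

In this toy data set, record B is filled by inference alone, while record C has no matching complete record and triggers the single retrieval query.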