Modeling Information Quality Risk In Data Mining

Ying Su, Donghong Li, Jie Peng
DOI: https://doi.org/10.1109/WiCom.2008.2424
2008-01-01
Abstract: Information quality (IQ) is a critical factor in the success of data mining (DM). Therefore, it is essential to measure IQ risk in a data warehouse to ensure that DM is implemented successfully. This paper presents a methodology for determining two IQ characteristics, accuracy and comprehensiveness, that are of critical importance to decision makers. The methodology examines how quality risks in source information affect the quality of information outputs produced by the relational algebra operations selection, projection, and Cubic product. It can be used to determine how quality risks associated with diverse data sources affect the quality of the derived data. The study resulted in a model of a data cube and an algebra that supports IQ risk operations on this cube. The model is simple and intuitive, and the algebra provides a means to express complex DM queries concisely.