Mining Method for Data Quality Detection Rules

LIU Bo, GENG Yin-Rong
DOI: https://doi.org/10.3969/j.issn.1003-6059.2012.05.015
2012-01-01
Pattern Recognition and Artificial Intelligence
Abstract: Data quality rules are key to database quality detection. To automatically discover data quality rules from relational databases and use them to detect erroneous or abnormal data, the form and evaluation measures of data quality rules are studied, and criteria for computing data quality rules are presented based on data item groups and a confidence threshold. Algorithms for mining minimal data quality rules and the main idea of detecting data errors with these rules are also given. The new form of data quality rule combines the confidence mechanism of association rules with the expression of conditional functional dependencies, so that functional dependencies, conditional functional dependencies and association rules are described in the same format. It is concluded that rules of this kind are concise, objective and complete, and that they detect erroneous or abnormal data accurately. Compared with related work, the proposed algorithms have lower time complexity, and the discovered quality rules improve the detection rate. Experiments demonstrate the effectiveness and correctness of the proposed methods.
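The abstract does not give the rule format or the mining algorithms, so the following is only a minimal sketch of the general idea of a confidence-scored dependency rule, not the authors' method. It assumes a rule of the hypothetical form "attributes `lhs` determine attribute `rhs`", groups rows by their `lhs` values, treats the most frequent `rhs` value in each group as the expected one, and reports the fraction of agreeing rows as the rule's confidence. The function name `rule_confidence`, the sample table and the threshold value are all illustrative assumptions.

```python
from collections import Counter, defaultdict


def rule_confidence(rows, lhs, rhs):
    """Score the hypothetical rule lhs -> rhs and list rows that deviate from it.

    rows: list of dicts (one per record); lhs: list of column names; rhs: column name.
    Returns (confidence, suspects), where suspects are indices of deviating rows.
    """
    # Count RHS values inside each group of rows sharing the same LHS values.
    groups = defaultdict(Counter)
    for row in rows:
        key = tuple(row[a] for a in lhs)
        groups[key][row[rhs]] += 1

    # The most frequent RHS value in a group is taken as the expected value.
    dominant = {key: counts.most_common(1)[0][0] for key, counts in groups.items()}

    # Confidence: share of rows whose RHS matches the expected value of their group.
    agreeing = sum(counts[dominant[key]] for key, counts in groups.items())
    confidence = agreeing / len(rows) if rows else 0.0

    # Rows that disagree with their group's expected value are candidate errors.
    suspects = [
        i for i, row in enumerate(rows)
        if row[rhs] != dominant[tuple(row[a] for a in lhs)]
    ]
    return confidence, suspects


# Illustrative data (not from the paper).
sample = [
    {"zip": "10001", "city": "New York"},
    {"zip": "10001", "city": "New York"},
    {"zip": "10001", "city": "Newark"},
    {"zip": "94105", "city": "San Francisco"},
]

MIN_CONFIDENCE = 0.7  # assumed threshold, chosen only for this example
confidence, suspects = rule_confidence(sample, ["zip"], "city")
if confidence >= MIN_CONFIDENCE:
    print(f"rule zip -> city kept (confidence {confidence:.2f}); suspect rows: {suspects}")
```

In this sketch a rule that holds for most rows but not all is still kept when its confidence clears the threshold, and the minority rows become candidates for inspection, which is the role the abstract assigns to the confidence threshold.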