Data quality measurement on categorical data using genetic algorithm

J. Malar Vizhi, T. Bhuvaneswari
DOI: https://doi.org/10.48550/arXiv.1202.3215
2012-02-15
Abstract: Data quality for categorical attributes is a difficult problem that has not received as much attention as its numerical counterpart. Our basic idea is to employ association rules for data quality measurement. Strong rule generation is an important area of data mining. Association rule mining can be treated as a multi-objective problem rather than a single-objective one. We concentrate on the rules generated by association rule mining using a genetic algorithm. The advantage of using genetic algorithms to discover high-level prediction rules is that they perform a global search and cope better with attribute interaction than the greedy rule induction algorithms often used in data mining. The genetic-algorithm-based approach exploits the linkage between association rules and feature selection. In this paper, we put forward a multi-objective genetic algorithm approach for data quality on categorical attributes. The results show that our approach performs well on objectives such as accuracy, completeness, comprehensibility, and interestingness.
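To make the idea of using association rules for data quality measurement concrete, here is a minimal sketch that scores a single rule over categorical records using the standard support and confidence measures. It is illustrative only: the function names (`matches`, `rule_metrics`), the sample records, and the treatment of rule violations as quality suspects are assumptions, not the paper's objectives or its genetic algorithm.

```python
from typing import Dict, List

Record = Dict[str, str]


def matches(record: Record, conditions: Dict[str, str]) -> bool:
    """Return True if the record has every attribute=value pair in conditions."""
    return all(record.get(attr) == value for attr, value in conditions.items())


def rule_metrics(records: List[Record],
                 antecedent: Dict[str, str],
                 consequent: Dict[str, str]) -> Dict[str, float]:
    """Score the rule antecedent -> consequent with standard support/confidence.

    'violations' counts records that satisfy the antecedent but not the
    consequent; treating them as data quality suspects is an assumption
    made for this sketch.
    """
    n = len(records)
    covered = [r for r in records if matches(r, antecedent)]
    both = [r for r in covered if matches(r, consequent)]
    support = len(both) / n if n else 0.0
    confidence = len(both) / len(covered) if covered else 0.0
    return {
        "support": support,
        "confidence": confidence,
        "violations": len(covered) - len(both),
    }


# Hypothetical categorical records; the third row breaks the rule.
records = [
    {"city": "Paris", "country": "France"},
    {"city": "Paris", "country": "France"},
    {"city": "Paris", "country": "Spain"},
    {"city": "Madrid", "country": "Spain"},
]

metrics = rule_metrics(records, {"city": "Paris"}, {"country": "France"})
print(metrics)
```

In a multi-objective setting, a genetic algorithm would search over candidate rules and score each one on several such measures at once; the search itself is not shown here.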
Databases