Data mining algorithm based on genetic algorithm and entropy

Lining Xing,Hua Tang
2007-01-01
Journal of Computational Information Systems
Abstract: Data mining is an interdisciplinary field whose core lies at the intersection of machine learning, statistics, and databases. Its goal is to extract accurate, comprehensible knowledge from data to support decision making. This paper proposes Genetic-Miner, a data mining algorithm for rule discovery based on a genetic algorithm and entropy. The goal of Genetic-Miner is to discover classification rules in data sets. We compared the performance of Genetic-Miner with that of two other well-known algorithms on six public-domain data sets. In terms of predictive accuracy, Genetic-Miner discovered rules with better predictive accuracy than the other two methods on these six data sets. It also consistently found much simpler (smaller) rule lists than the other two methods. Genetic-Miner therefore seems particularly advantageous when it is important to minimize the number of discovered rules and rule terms (conditions) in order to improve the comprehensibility of the discovered knowledge.
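The abstract does not say how entropy enters the algorithm. As a rough illustration only, and not the authors' method, the sketch below computes the Shannon entropy of the class labels among the records covered by a single rule term (an attribute = value condition). A low value means the term selects records that mostly share one class, which is the kind of signal an entropy-guided rule learner could use. The function names, the `class_key` parameter, and the toy records are all hypothetical.

```python
import math
from collections import Counter


def class_entropy(labels):
    """Shannon entropy, in bits, of a sequence of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    # Sum -p * log2(p) over the observed classes.
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def term_entropy(records, attribute, value, class_key="class"):
    """Entropy of the class labels among records where attribute == value.

    Hypothetical helper: it scores one candidate rule term, nothing more.
    """
    covered = [r[class_key] for r in records if r.get(attribute) == value]
    return class_entropy(covered)


# Toy data set, invented for illustration.
records = [
    {"outlook": "sunny", "windy": False, "class": "no"},
    {"outlook": "sunny", "windy": True, "class": "no"},
    {"outlook": "overcast", "windy": False, "class": "yes"},
    {"outlook": "rainy", "windy": False, "class": "yes"},
    {"outlook": "rainy", "windy": True, "class": "no"},
]

if __name__ == "__main__":
    for attribute, value in [("outlook", "sunny"), ("outlook", "rainy"), ("windy", False)]:
        score = term_entropy(records, attribute, value)
        print(f"{attribute} = {value}: {score:.4f} bits")
```

In this toy data the term outlook = sunny covers only records of class "no", so its entropy is zero, while outlook = rainy covers one record of each class and scores one full bit.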