Abstract: Data dimensionality reduction is usually carried out before patterns are input to classifiers. Selecting relevant data is desirable for obtaining good results in data mining, yet many data sets include irrelevant or redundant attributes that interfere with knowledge discovery. In this paper, we propose a rule-extraction method based on a novel separability-correlation measure (SCM) that ranks the importance of attributes. According to the attribute ranking results, the attribute subsets that lead to the best classification results are selected and used as inputs to a classifier, such as the RBF neural network used in this paper. The complexity of the classifier can thus be reduced and its classification performance improved. Our method then extracts rules from the classification results obtained with the reduced attribute sets. Computer simulations show that our method leads to smaller rule sets with higher accuracies than other methods.
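
The rank-then-select pipeline summarized in the abstract can be illustrated with a minimal sketch. The abstract does not define the SCM, so the code below substitutes a Fisher-style ratio of between-class to within-class variance as an assumed stand-in for the ranking score, and a nearest-centroid classifier as an assumed stand-in for the RBF neural network. All function names (`fisher_score`, `rank_attributes`, `select_subset`) are hypothetical and chosen only for illustration.

```python
from statistics import mean


def fisher_score(values, labels):
    """Between-class over within-class variance for one attribute.

    Assumed stand-in for the paper's SCM, which the abstract does not define.
    """
    overall = mean(values)
    between = 0.0
    within = 0.0
    for c in set(labels):
        group = [v for v, y in zip(values, labels) if y == c]
        mu = mean(group)
        between += len(group) * (mu - overall) ** 2
        within += sum((v - mu) ** 2 for v in group)
    return between / within if within > 0 else float("inf")


def rank_attributes(X, y):
    """Return attribute indices ordered from most to least important."""
    n_attrs = len(X[0])
    scores = [fisher_score([row[j] for row in X], y) for j in range(n_attrs)]
    return sorted(range(n_attrs), key=lambda j: scores[j], reverse=True)


def nearest_centroid_accuracy(X_train, y_train, X_val, y_val, attrs):
    """Accuracy of a nearest-centroid classifier restricted to `attrs`.

    Assumed stand-in for the RBF neural network used in the paper.
    """
    centroids = {}
    for c in set(y_train):
        rows = [r for r, lab in zip(X_train, y_train) if lab == c]
        centroids[c] = [mean(r[j] for r in rows) for j in attrs]
    correct = 0
    for r, lab in zip(X_val, y_val):
        pred = min(
            centroids,
            key=lambda c: sum((r[j] - m) ** 2 for j, m in zip(attrs, centroids[c])),
        )
        correct += pred == lab
    return correct / len(y_val)


def select_subset(X_train, y_train, X_val, y_val):
    """Try the top-k ranked attributes for each k and keep the best subset.

    Ties are resolved in favor of the smaller subset.
    """
    ranking = rank_attributes(X_train, y_train)
    best_attrs, best_acc = None, -1.0
    for k in range(1, len(ranking) + 1):
        attrs = ranking[:k]
        acc = nearest_centroid_accuracy(X_train, y_train, X_val, y_val, attrs)
        if acc > best_acc:
            best_attrs, best_acc = attrs, acc
    return ranking, best_attrs, best_acc
```

Calling `select_subset` with training and validation splits returns the full ranking, the chosen subset, and its validation accuracy. Evaluating nested top-k subsets keeps the search linear in the number of attributes, which is one plausible reading of selecting subsets "according to the attribute ranking results"; the paper's actual search procedure may differ.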

Data dimensionality reduction with application to improving classification performance and explaining concepts of data sets.