Cost-sensitive decision trees with post-pruning and competition for numeric data
Zilong Xu, Fan Min, William Zhu
DOI: https://doi.org/10.12733/jcis6488
2013-01-01
Journal of Computational Information Systems
Abstract: Decision trees are an effective classification approach in data mining and machine learning. In some applications, test costs and misclassification costs should be considered while inducing decision trees. Recently, several cost-sensitive learning algorithms based on ID3, such as CS-ID3, IDX, ICET and λ-ID3, have been proposed to address this issue. In this paper, we develop a decision tree algorithm inspired by C4.5 with post-pruning and competition for numeric data. The test-cost-weighted information gain ratio serves as the heuristic information while building the tree. The focus of the algorithm is the post-pruning technique, which considers the tradeoff between test costs and misclassification costs. To obtain even better results, we employ the competition approach to construct a forest and select the best tree. Experimental results indicate the effectiveness of the heuristic function, the efficiency of the post-pruning technique, and the usefulness of the competition approach. © 2013 Binary Information Press.
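The abstract does not give the formulas, so the following is only a minimal sketch of the two ideas it names: a gain ratio weighted by test cost, and a pruning check that trades test cost against misclassification cost. The weighting form `gain_ratio * test_cost ** lam` (with `lam <= 0` and a positive test cost), the uniform per-instance misclassification cost, and all function names are assumptions made for illustration, not the authors' definitions.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def split(values, labels, threshold):
    """Partition labels by the binary numeric test `value <= threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    return left, right


def gain_ratio(values, labels, threshold):
    """C4.5-style gain ratio for a binary split on a numeric attribute."""
    left, right = split(values, labels, threshold)
    if not left or not right:
        return 0.0  # degenerate split carries no information
    n = len(labels)
    pl, pr = len(left) / n, len(right) / n
    gain = entropy(labels) - pl * entropy(left) - pr * entropy(right)
    split_info = -(pl * math.log2(pl) + pr * math.log2(pr))
    return gain / split_info


def weighted_gain_ratio(values, labels, threshold, test_cost, lam=-1.0):
    """Assumed weighting: cheaper tests score higher when lam < 0.

    Requires test_cost > 0 when lam < 0.
    """
    return gain_ratio(values, labels, threshold) * (test_cost ** lam)


def leaf_cost(labels, mis_cost):
    """Total misclassification cost if the node predicts its majority class.

    Assumes one uniform cost per misclassified instance.
    """
    if not labels:
        return 0.0
    majority_count = Counter(labels).most_common(1)[0][1]
    return (len(labels) - majority_count) * mis_cost


def should_prune(values, labels, threshold, test_cost, mis_cost):
    """Prune when replacing the split by a leaf does not increase total cost."""
    left, right = split(values, labels, threshold)
    as_leaf = leaf_cost(labels, mis_cost)
    as_split = (len(labels) * test_cost
                + leaf_cost(left, mis_cost)
                + leaf_cost(right, mis_cost))
    return as_leaf <= as_split
```

With this sketch, a split that separates the classes perfectly is kept while the test is cheap and pruned once paying for the test on every instance costs more than the misclassifications it avoids.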