Using Instance Cloning to Improve Naive Bayes for Ranking
Liangxiao Jiang, Dianhong Wang, Harry Zhang, Zhihua Cai, Bo Huang
DOI: https://doi.org/10.1142/s0218001408006703
IF: 1.261
2008-01-01
International Journal of Pattern Recognition and Artificial Intelligence
Abstract: Improving naive Bayes (NB)(15,28) for classification has received significant attention. Related work can be broadly divided into two approaches: eager learning and lazy learning.(1) Unlike eager learning, lazy learning extends naive Bayes by learning an improved naive Bayes model for each test instance. In recent years, several lazy extensions of naive Bayes have been proposed, for example LBR,(30) SNNB,(27) and LWNB.(8) All of these algorithms aim to improve the classification performance of naive Bayes, and they do achieve significant improvements in accuracy. In many real-world data mining applications, however, an accurate ranking is more desirable than an accurate classification. A natural question is therefore whether they also achieve significant improvement in ranking, measured by AUC (the area under the ROC curve).(2,11,17) To answer this question, we conduct experiments on the 36 UCI data sets(18) selected by Weka(12) to investigate their ranking performance, and find that they do not significantly improve the ranking performance of naive Bayes. To improve the ranking performance of naive Bayes, we present a novel lazy method, ICNB (instance cloned naive Bayes), and develop three ICNB algorithms that use different instance cloning strategies. We empirically compare them with naive Bayes. The experimental results show that our algorithms achieve significant improvement in terms of AUC. Our research provides a simple but effective method for applications where an accurate ranking is desirable.
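The distinction the abstract draws between accuracy and AUC can be illustrated with a minimal sketch. It is not code from the paper: the function names `auc` and `accuracy`, the 0.5 decision threshold, and the toy scores are all assumptions chosen for illustration. AUC is computed here through its standard pairwise interpretation, the fraction of (positive, negative) pairs in which the positive instance receives the higher score, with ties counted as one half.

```python
def auc(scores, labels):
    """Pairwise AUC for binary labels (1 = positive, 0 = negative)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # a tie counts as half a win
    return wins / (len(pos) * len(neg))


def accuracy(scores, labels, threshold=0.5):
    """Accuracy when predicting positive iff score >= threshold (assumed rule)."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


# Hypothetical class-probability estimates from two models on four instances.
labels = [1, 1, 0, 0]
scores_a = [0.9, 0.4, 0.6, 0.1]
scores_b = [0.6, 0.4, 0.9, 0.1]

acc_a, acc_b = accuracy(scores_a, labels), accuracy(scores_b, labels)
auc_a, auc_b = auc(scores_a, labels), auc(scores_b, labels)
print(acc_a, acc_b)  # identical accuracy
print(auc_a, auc_b)  # different ranking quality
```

In this toy example both hypothetical models make the same thresholded predictions and so have the same accuracy, yet they order the instances differently and so have different AUC. This is one way to see why an improvement in accuracy need not carry over to ranking.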