A Technique For Improving The Performance Of Naive Bayes Text Classification

Yuqian Jiang, Huaizhong Lin, Xuesong Wang, Dongming Lu
DOI: https://doi.org/10.1007/978-3-642-23982-3_25
2011-01-01
Abstract: The Naive Bayes classifier is widely used in text classification tasks. Because it can perform surprisingly well, it is often regarded as a baseline. However, previous research shows that a skewed distribution of the training collection may cause poor results in text classification. This paper presents a new method to deal with this situation. We introduce a conditional probability that takes into account information from both the whole corpus and each category. Our proposed method performs well on the standard benchmark collections, competing with state-of-the-art text classifiers, especially on skewed data.
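The abstract does not state the formula for the proposed conditional probability, so the following is only a minimal illustrative sketch of the general idea and not the authors' method. It assumes a multinomial Naive Bayes model whose per-category word probability is linearly interpolated with a corpus-wide word probability (a Jelinek-Mercer-style mixture, swapped in here as a stand-in). The function names `train`, `word_prob`, and `classify`, the mixing weight `lam`, and the toy data are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def train(docs, labels):
    """Collect per-category and corpus-wide term counts from tokenized documents."""
    class_counts = defaultdict(Counter)  # term counts within each category
    corpus_counts = Counter()            # term counts over the whole corpus
    class_docs = Counter(labels)         # number of training documents per category
    for tokens, label in zip(docs, labels):
        class_counts[label].update(tokens)
        corpus_counts.update(tokens)
    return class_counts, corpus_counts, class_docs


def word_prob(term, label, model, lam=0.5):
    """Assumed mixture: lam * P(term | category) + (1 - lam) * P(term | corpus)."""
    class_counts, corpus_counts, _ = model
    vocab_size = len(corpus_counts)
    # Both estimates use add-one (Laplace) smoothing so no probability is zero.
    p_class = (class_counts[label][term] + 1) / (
        sum(class_counts[label].values()) + vocab_size
    )
    p_corpus = (corpus_counts[term] + 1) / (
        sum(corpus_counts.values()) + vocab_size
    )
    return lam * p_class + (1 - lam) * p_corpus


def classify(tokens, model, lam=0.5):
    """Return the category with the highest log prior plus summed log word probabilities."""
    _, _, class_docs = model
    total_docs = sum(class_docs.values())
    scores = {}
    for label, n_docs in class_docs.items():
        score = math.log(n_docs / total_docs)
        for term in tokens:
            score += math.log(word_prob(term, label, model, lam))
        scores[label] = score
    return max(scores, key=scores.get)


# Toy skewed training set: four "sport" documents, one "politics" document.
docs = [
    ["ball", "goal"],
    ["ball", "team"],
    ["goal", "team"],
    ["match", "ball"],
    ["vote", "law"],
]
labels = ["sport", "sport", "sport", "sport", "politics"]
model = train(docs, labels)
```

Calling `classify(["ball", "goal"], model)` on this toy model selects the majority category. The weight `lam` controls how strongly the category-specific counts dominate the corpus-wide counts; how the paper itself combines the two sources of information is not specified in the abstract.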