A Two-Stage Spam Email Filtering Method Based on Naive Bayes and Hierarchical Clustering

LIAO Ming-tao, ZHANG De-yun, LI Jin-ku
DOI: https://doi.org/10.3969/j.issn.1000-7180.2007.08.001
2007-01-01
Abstract: To reduce the misclassification rate of legitimate emails, this paper proposes a two-stage spam email filtering method based on naive Bayes and hierarchical clustering. The method classifies each email as Legitimate, Unsure, or Spam. In the first stage, a naive Bayesian classifier labels each email as Legitimate or Unsure. In the second stage, a hierarchical clustering method is used to find similar emails in a pre-collected set of spam emails. Experiments show that the method increases the precision of spam detection and lowers the misclassification of legitimate emails, which makes it more viable in practice.
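The two-stage flow described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not give the features, thresholds, distance measure, or linkage rule, so the whitespace tokenizer, Laplace-smoothed multinomial naive Bayes, Jaccard distance, single-linkage clustering, the names `nb_threshold` and `cut`, and the toy data are all assumptions. It also assumes that stage two is applied to emails that stage one marked Unsure, and that an Unsure email close enough to a spam cluster is promoted to Spam.

```python
import math
from collections import Counter


def tokens(text):
    # Assumed tokenizer: lowercase and split on whitespace.
    return text.lower().split()


class NaiveBayes:
    """Multinomial naive Bayes with Laplace smoothing (assumed variant)."""

    def __init__(self, legit_docs, spam_docs):
        self.counts = {"legit": Counter(), "spam": Counter()}
        for doc in legit_docs:
            self.counts["legit"].update(tokens(doc))
        for doc in spam_docs:
            self.counts["spam"].update(tokens(doc))
        self.vocab = set(self.counts["legit"]) | set(self.counts["spam"])
        n = len(legit_docs) + len(spam_docs)
        self.prior = {"legit": len(legit_docs) / n, "spam": len(spam_docs) / n}

    def log_score(self, text, label):
        total = sum(self.counts[label].values())
        score = math.log(self.prior[label])
        for tok in tokens(text):
            if tok in self.vocab:  # out-of-vocabulary tokens are ignored
                score += math.log(
                    (self.counts[label][tok] + 1) / (total + len(self.vocab))
                )
        return score

    def spam_probability(self, text):
        diff = self.log_score(text, "legit") - self.log_score(text, "spam")
        return 1.0 / (1.0 + math.exp(diff))


def jaccard_distance(a, b):
    sa, sb = set(tokens(a)), set(tokens(b))
    if not sa and not sb:
        return 0.0
    return 1.0 - len(sa & sb) / len(sa | sb)


def single_linkage_clusters(docs, cut):
    """Agglomerative clustering: merge clusters until no pair is within `cut`."""
    clusters = [[doc] for doc in docs]
    merged = True
    while merged:
        merged = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                gap = min(
                    jaccard_distance(a, b)
                    for a in clusters[i]
                    for b in clusters[j]
                )
                if gap <= cut:
                    clusters[i].extend(clusters.pop(j))
                    merged = True
                    break
            if merged:
                break
    return clusters


def classify(text, nb, spam_clusters, nb_threshold=0.5, cut=0.5):
    # Stage 1: naive Bayes separates Legitimate from Unsure.
    if nb.spam_probability(text) < nb_threshold:
        return "Legitimate"
    # Stage 2: an Unsure email that joins a spam cluster is labelled Spam.
    for cluster in spam_clusters:
        if min(jaccard_distance(text, member) for member in cluster) <= cut:
            return "Spam"
    return "Unsure"


# Toy data, invented for illustration only.
LEGIT = [
    "meeting agenda for monday project review",
    "please review the attached project report",
    "lunch on friday with the team",
]
SPAM = [
    "win free money now click here",
    "free money click here to win prize",
    "cheap pills buy now discount",
]

nb = NaiveBayes(LEGIT, SPAM)
spam_clusters = single_linkage_clusters(SPAM, cut=0.5)
print(classify("team meeting on monday about project report", nb, spam_clusters))
print(classify("click here to win free money", nb, spam_clusters))
print(classify("buy now discount on concert tickets tonight", nb, spam_clusters))
```

On this toy data the three calls print Legitimate, Spam, and Unsure respectively. The last message looks spam-like to the classifier but is not close to any collected spam, so it stays Unsure rather than being rejected, which is how a design of this kind can lower the misclassification of legitimate emails.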