A Method of Chinese Spam Filtering Based on Suffix Array Clustering (SAC)

Xiang-Ying LI, Zhong CHEN, Li-Yong TANG, Xin LI
DOI: https://doi.org/10.3969/j.issn.1002-137X.2006.05.027
2006-01-01
Computer Science
Abstract: The naive Bayes algorithm has been widely applied to spam filtering. However, its performance in Chinese email filtering is unsatisfactory. This paper proposes SAC, a token extraction method for Chinese email based on suffix array clustering. It also compares the filtering results of naive Bayes under different token extraction methods. The experiments demonstrate that the method improves filtering performance for Chinese spam.
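The abstract does not give the details of SAC, so the following is only a minimal sketch of the general idea of suffix-based token extraction, not the authors' algorithm. It assumes that a "token" is any substring shared by several emails, found by sorting all suffixes (a naive suffix array) and grouping neighbours with a common prefix. The function names `extract_tokens` and `tokenize`, and the parameters `min_len`, `max_len` and `min_docs`, are hypothetical.

```python
from itertools import groupby


def extract_tokens(texts, min_len=2, max_len=4, min_docs=2):
    """Return substrings of length min_len..max_len found in >= min_docs texts.

    Assumption: this stands in for the clustering step; the paper's actual
    SAC procedure is not described in the abstract.
    """
    # Naive suffix array: every suffix (truncated to max_len) with its document id.
    suffixes = sorted(
        (t[i:i + max_len], d) for d, t in enumerate(texts) for i in range(len(t))
    )
    tokens = set()
    for n in range(min_len, max_len + 1):
        # Suffixes sharing an n-character prefix are adjacent after sorting.
        candidates = [(s[:n], d) for s, d in suffixes if len(s) >= n]
        for prefix, group in groupby(candidates, key=lambda item: item[0]):
            if len({d for _, d in group}) >= min_docs:
                tokens.add(prefix)
    return tokens


def tokenize(text, tokens, max_len=4):
    """Greedy longest-match segmentation using the extracted tokens."""
    out, i = [], 0
    while i < len(text):
        for n in range(max_len, 0, -1):
            piece = text[i:i + n]
            if piece in tokens:
                out.append(piece)
                i += len(piece)
                break
        else:
            i += 1  # skip a character that no token covers
    return out


emails = ["免费发票代开", "代开发票优惠", "明天开会讨论"]
vocab = extract_tokens(emails)
print(sorted(vocab))
print(tokenize(emails[0], vocab))
```

In a filter of the kind the abstract describes, the token lists produced this way would replace word-segmented or single-character tokens as the input features of a naive Bayes classifier.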