Character-Based Language Modeling Approach for Spam Filtering

苏绥,林鸿飞,叶正
DOI: https://doi.org/10.3969/j.issn.1003-0077.2009.02.006
2009-01-01
Abstract: Content-based spam filtering is one of the mainstream technologies in use today. After a brief review of the state of the art in content-based spam filtering, this paper proposes a character-based language modeling approach to the spam filtering task that builds on these technologies. We experimentally compare the performance of this approach with Naive Bayes, SVM, and a word-based language modeling approach. Our experimental results show that the character-based language modeling approach achieves high performance and can be easily applied in large-scale online e-mail systems.
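The abstract does not specify the n-gram order, the smoothing method, or the decision rule, so the following is only a minimal sketch of how a character-based language model classifier could look, not the authors' implementation. It assumes character trigrams, add-one smoothing, and a log-likelihood-ratio decision between one model trained on spam and one trained on legitimate mail; the names `CharNgramLM`, `classify`, and `log_prior_ratio` are hypothetical.

```python
import math
from collections import Counter


class CharNgramLM:
    """Character n-gram language model with add-one smoothing (an assumed choice)."""

    def __init__(self, n=3):
        self.n = n
        self.ngrams = Counter()    # counts of n-character sequences
        self.contexts = Counter()  # counts of the (n-1)-character prefixes
        self.vocab = set()         # distinct characters seen in training

    def _pad(self, text):
        # Pad the start so the first characters also have a full context.
        return "\x02" * (self.n - 1) + text

    def train(self, texts):
        for text in texts:
            padded = self._pad(text)
            self.vocab.update(text)
            for i in range(len(padded) - self.n + 1):
                gram = padded[i:i + self.n]
                self.ngrams[gram] += 1
                self.contexts[gram[:-1]] += 1

    def log_prob(self, text):
        padded = self._pad(text)
        v = len(self.vocab) + 1  # +1 reserves probability mass for unseen characters
        total = 0.0
        for i in range(len(padded) - self.n + 1):
            gram = padded[i:i + self.n]
            total += math.log(
                (self.ngrams[gram] + 1) / (self.contexts[gram[:-1]] + v)
            )
        return total


def classify(text, spam_lm, ham_lm, log_prior_ratio=0.0):
    # Label as spam when the spam model explains the text better than the ham model.
    score = spam_lm.log_prob(text) - ham_lm.log_prob(text) + log_prior_ratio
    return "spam" if score > 0 else "ham"


# Toy training data, invented purely for illustration.
spam_lm = CharNgramLM(n=3)
spam_lm.train(["buy cheap pills now", "cheap pills cheap pills", "win money now"])
ham_lm = CharNgramLM(n=3)
ham_lm.train(["meeting at noon tomorrow", "see you at the meeting", "lunch tomorrow at noon"])

print(classify("cheap pills now", spam_lm, ham_lm))
print(classify("meeting at noon", spam_lm, ham_lm))
```

Because the model works on raw characters, it needs no tokenizer or word segmentation, and both training and scoring are a single pass over the message, which is consistent with the abstract's claim that the approach suits large-scale online systems.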