A Bayesian Approach for Text Filter on 3G Network

Huang Jie, Huang Bei, Pu Wenjing
DOI: https://doi.org/10.1109/WICOM.2010.5601282
2010-01-01
Abstract: With the rapid spread of third-generation (3G) mobile communication technology, the volume of junk information on 3G networks has increased rapidly. Pornographic and other junk content has flooded 3G networks and become a serious social problem, so exchanged information must be filtered quickly and efficiently. This article proposes an improved Bayesian filtering algorithm for classifying text messages, called the double threshold Bayesian algorithm based on minimum risk (DTBA). The method uses Document Frequency (DF) to select feature words. Because it classifies text messages with a high precision rate and a low error rate, it is suitable for 3G networks. Two text classification approaches, the DTBA and the classical minimum risk-based Bayesian algorithm (MRBA), are tested in the TD-SCDMA system. The results show that the DTBA offers better controllability and that its recall rate for junk text messages can reach 95.2%. The DTBA can therefore provide a real-time, highly efficient filter against junk messages.
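The abstract does not give the DTBA's formulas, so the following Python sketch only illustrates the general idea it names: DF-based feature selection followed by a Bayesian decision with two thresholds. Everything specific here is an assumption rather than the authors' method: the function names, the toy messages, the Laplace smoothing, the simplification of scoring only words present in a message, and the reading of "double threshold" as a lower and an upper bound on the log posterior odds with an "uncertain" band between them. For comparison, a classical minimum-risk rule with a single cost ratio corresponds to a single threshold on the same odds.

```python
import math
from collections import Counter

def select_features_by_df(docs, min_df=2):
    """Keep words that appear in at least min_df documents (Document Frequency)."""
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # count each word once per document
    return {w for w, n in df.items() if n >= min_df}

def train(docs, labels, vocab):
    """Estimate class priors and Laplace-smoothed word-presence probabilities."""
    n_docs = Counter(labels)
    present = {c: Counter() for c in n_docs}
    for tokens, c in zip(docs, labels):
        present[c].update(set(tokens) & vocab)
    prior = {c: n_docs[c] / len(docs) for c in n_docs}
    cond = {c: {w: (present[c][w] + 1) / (n_docs[c] + 2) for w in vocab}
            for c in n_docs}
    return prior, cond

def log_odds(tokens, prior, cond, vocab):
    """Log posterior odds of junk vs. legit, using only feature words present."""
    score = math.log(prior["junk"]) - math.log(prior["legit"])
    for w in set(tokens) & vocab:
        score += math.log(cond["junk"][w]) - math.log(cond["legit"][w])
    return score

def classify(tokens, prior, cond, vocab, lower=-1.0, upper=1.0):
    """Assumed double-threshold rule: junk above upper, legit below lower."""
    s = log_odds(tokens, prior, cond, vocab)
    if s >= upper:
        return "junk"
    if s <= lower:
        return "legit"
    return "uncertain"  # hypothetical middle band, e.g. held for further checks

# Hypothetical toy training set (not data from the paper).
docs = [
    ["win", "free", "prize", "now"],
    ["free", "prize", "click", "now"],
    ["free", "offer", "click"],
    ["meeting", "at", "noon", "tomorrow"],
    ["lunch", "tomorrow", "at", "noon"],
    ["see", "you", "at", "meeting"],
]
labels = ["junk", "junk", "junk", "legit", "legit", "legit"]

vocab = select_features_by_df(docs, min_df=2)
prior, cond = train(docs, labels, vocab)
```

Under these assumptions, widening the gap between `lower` and `upper` sends more borderline messages to the "uncertain" band instead of forcing a junk or legitimate decision, which is one plausible way a two-threshold scheme could give an operator finer control than a single cutoff.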