Multi-class Classification of Bi-lingual SMS Using Naive Bayes Algorithm

Muhammad Umair, Zarmeen Saeed, Mubashir Ahmad, Hafiz Amir, Bilal Akmal, Nisar Ahmad
DOI: https://doi.org/10.1109/inmic50486.2020.9318153
2020-01-01
Abstract: Short Message Service (SMS) is the most frequently used means of portable communication via mobile phones. Unwanted messages have increased privacy threats, storage problems, and wasted time. Text classification is a practical way to exclude such content. Support Vector Machine and Stochastic Gradient Descent classifiers are commonly used for text classification, but they fall short in accuracy when sorting text messages into multiple subclasses because they depend on several hyperparameters and are sensitive to feature scaling. This leads to misclassification and to higher false-positive and false-negative rates. This paper presents a novel approach that classifies English and Roman Urdu text messages into multiple classes using the Multinomial Naive Bayes algorithm. The proposed classifier responds faster because it is less computationally intensive, and it performs well with less training data, which makes it suitable for running on Android mobile phones. The proposed solution achieves 98.87% accuracy on a three-class problem.
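To make the Multinomial Naive Bayes approach named in the abstract concrete, the following is a minimal sketch of such a classifier with Laplace smoothing, written in plain Python. It is not the authors' implementation: the class labels (`spam`, `promo`, `personal`), the toy English and Roman Urdu messages, the whitespace tokenizer, and the smoothing value `alpha=1.0` are all assumptions made for illustration, since the abstract does not specify them.

```python
import math
from collections import Counter, defaultdict


class MultinomialNB:
    """Minimal multinomial Naive Bayes over whitespace tokens (illustrative only)."""

    def __init__(self, alpha=1.0):
        # alpha is the Laplace smoothing constant (assumed value, not from the paper).
        self.alpha = alpha

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)       # documents per class
        self.word_counts = defaultdict(Counter)   # token counts per class
        self.vocab = set()
        for doc, label in zip(docs, labels):
            tokens = doc.lower().split()
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        self.total_docs = len(labels)
        return self

    def predict(self, doc):
        # Tokens never seen in training are ignored.
        tokens = [t for t in doc.lower().split() if t in self.vocab]
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / self.total_docs)
            total = sum(self.word_counts[label].values())
            for t in tokens:
                count = self.word_counts[label][t]
                score += math.log((count + self.alpha) / (total + self.alpha * vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training data mixing English and Roman Urdu; labels are assumed.
train = [
    ("win free prize now call", "spam"),
    ("free inaam jeeto abhi call karo", "spam"),
    ("sale discount offer on shoes", "promo"),
    ("naya offer sasta package hasil karein", "promo"),
    ("kal milte hain dinner pe", "personal"),
    ("are you coming home tonight", "personal"),
]
model = MultinomialNB().fit([d for d, _ in train], [l for _, l in train])

for message in ["free prize call", "discount offer", "kal dinner pe milte hain"]:
    print(message, "->", model.predict(message))
```

Training only counts tokens and prediction only sums log probabilities, which is consistent with the abstract's point that the method is computationally light. A real system would also need proper tokenization and normalization of Roman Urdu spelling variants, which this sketch omits.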