Fake News Detection on Hindi News Dataset

Sudhanshu Kumar, Thoudam Doren Singh
DOI: https://doi.org/10.1016/j.gltp.2022.03.014
2022-04-01
Global Transitions Proceedings
Abstract: With the growth of social networks, more people are creating and sharing information than ever before, and much of it has no relation to reality. As a result, fake news serving various political and commercial purposes spreads quickly. The proliferation of online newspapers has made it challenging to identify trustworthy news sources. In this work, Hindi news articles are collected from various news sources. The preprocessing, feature extraction, classification and prediction processes are discussed in detail. The preprocessing step includes data cleaning, stop-word removal, tokenization and stemming. Term frequency-inverse document frequency (TF-IDF) is used for feature extraction. Three classifiers, Naïve Bayes, logistic regression and Long Short-Term Memory (LSTM), are used to detect fake news and are compared on detection with probability of truth. Among the three, LSTM achieved the best accuracy, 92.36%.
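The preprocessing and TF-IDF steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The stop-word list, the example headlines, and the function names `tokenize` and `tfidf` are hypothetical. The IDF formula is one common smoothed variant, since the abstract does not say which weighting the paper uses. Stemming is omitted.

```python
import math
import re
from collections import Counter

# Hypothetical, tiny stop-word list for illustration only;
# the paper's actual list is not given in the abstract.
STOP_WORDS = {"है", "की", "के", "में", "और", "से", "को", "का"}


def tokenize(text):
    """Split on whitespace and basic punctuation, then drop stop words."""
    tokens = re.split(r"[\s।,.!?]+", text.strip())
    return [t for t in tokens if t and t not in STOP_WORDS]


def tfidf(docs):
    """Return one {term: weight} dict per document.

    Assumed weighting: tf = count / document length,
    idf = ln((1 + N) / (1 + df)) + 1 (a smoothed variant).
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


# Made-up example headlines, not taken from the paper's dataset.
docs = [
    "सरकार ने नई योजना की घोषणा की",
    "सरकार ने चमत्कारी दवा की घोषणा की",
]
vectors = tfidf(docs)
```

Terms that occur in only one headline (such as "योजना") receive a higher weight than terms shared by both (such as "सरकार"). A classifier can use that kind of signal once the sparse vectors are mapped onto a fixed vocabulary.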