Abstract: This paper presents an impartial and extensive benchmark for text classification involving five different text classification tasks, 20 datasets, 11 different model architectures, and 42,800 algorithm runs. The five text classification tasks are fake news classification, topic detection, emotion detection, polarity detection, and sarcasm detection. While research in practice, especially in Natural Language Processing (NLP), tends to focus on the most sophisticated models, we hypothesize that this is not always necessary. Our main objective is therefore to investigate whether the largest state-of-the-art (SOTA) models are always preferable, or for which dataset specifications and classification tasks simple methods can compete with complex models. We assess the performance of methods of varying complexity, ranging from simple statistical and machine learning methods to pretrained transformers such as RoBERTa (robustly optimized BERT pretraining approach, where BERT stands for Bidirectional Encoder Representations from Transformers). Such a comprehensive benchmark is lacking in the existing literature, where research mainly compares similar types of methods. Furthermore, with increasing awareness of the ecological impact of extensive computational resource usage, this comparison is both critical and timely. We find that, overall, bidirectional long short-term memory (LSTM) networks are ranked as the best-performing method, albeit not statistically significantly better than logistic regression and RoBERTa. Overall, we cannot conclude that simple methods perform worse, although this depends mainly on the classification task. Concretely, we find that for fake news classification and topic detection, simple techniques are the best-ranked models; consequently, it is not necessary to train complicated neural network architectures for these classification tasks.
Moreover, we find a negative correlation between F1 performance and complexity for the smallest datasets (those with a dataset size below 10,000). Finally, the results of the different models are analyzed in depth to explain the model decisions, which is an increasingly common requirement in the field of text classification.
