Performance Evaluation of Machine Learning Classifiers in Sentiment Mining

Vinodhini G, Chandrasekaran RM

DOI: https://doi.org/10.48550/arXiv.1402.3891

2014-02-17

Abstract: In recent years, machine learning classifiers have been of great value in solving a variety of text classification problems. Sentiment mining is a kind of text classification in which messages are classified according to sentiment orientation, such as positive or negative. This paper extends the idea of evaluating the performance of various classifiers to show their effectiveness in sentiment mining of online product reviews. The product reviews are collected from Amazon. To evaluate the performance of the classifiers, various evaluation methods such as random sampling, linear sampling, and bootstrap sampling are used. Our results show that the support vector machine with the bootstrap sampling method outperforms the other classifiers and sampling methods in terms of misclassification rate.

Machine Learning, Computation and Language, Information Retrieval

What problem does this paper attempt to address?

This paper evaluates the performance of different machine-learning classifiers in sentiment mining, specifically sentiment classification of online product reviews. The authors experimentally compare four common classifiers (decision tree, K-nearest neighbor, Naive Bayes, and support vector machine) under three sampling methods (random sampling, linear sampling, and bootstrap sampling) to determine which combination of classifier and sampling method has the lowest misclassification rate.

### Main problems and objectives of the paper:

1. **Sentiment mining task**: Classify online product reviews according to sentiment orientation (such as positive or negative).
2. **Classifier selection**: Select four common machine-learning classifiers (decision tree, K-nearest neighbor, Naive Bayes, and support vector machine) for performance evaluation.
3. **Influence of sampling methods**: Evaluate the influence of different sampling methods (random sampling, linear sampling, and bootstrap sampling) on classifier performance.
4. **Data sources**: Use review data for five different products (camera, mobile phone, iPod, laptop, and music player) collected from Amazon.
5. **Performance indicators**: Measure classifier performance by misclassification rate and ensure the reliability of the results through cross-validation.

### Core contributions of the paper:

- It shows empirically that the support vector machine (SVM) combined with bootstrap sampling has the lowest misclassification rate in all tested product categories.
- It analyzes the influence of different sampling methods on classifier performance and finds that bootstrap sampling is significantly better than the other sampling methods.
- It outlines future research directions in sentiment mining and suggests further exploration of ensemble learning and genetic algorithms.

### Summary:

The main purpose of this paper is to evaluate, through empirical research, the performance of different machine-learning classifiers and sampling methods in sentiment mining tasks, and to provide a theoretical basis and technical guidance for practical applications.
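The summary above names three sampling methods and one metric without defining them. The following is a minimal Python sketch of how they are commonly understood, not the authors' implementation. The function names, the `n_train` parameter, and the exact definitions are assumptions: linear sampling is taken to mean an ordered split, random sampling a shuffled split, and bootstrap sampling a draw with replacement, with the never-drawn examples held out for testing.

```python
import random


def misclassification_rate(y_true, y_pred):
    """Fraction of examples whose predicted label differs from the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("label sequences must have the same length")
    if not y_true:
        raise ValueError("at least one label is required")
    wrong = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    return wrong / len(y_true)


def linear_split(n, n_train):
    """Assumed 'linear sampling': keep the original order, first n_train train."""
    indices = list(range(n))
    return indices[:n_train], indices[n_train:]


def random_split(n, n_train, seed=0):
    """Assumed 'random sampling': shuffle the indices, then cut at n_train."""
    rng = random.Random(seed)
    indices = list(range(n))
    rng.shuffle(indices)
    return indices[:n_train], indices[n_train:]


def bootstrap_split(n, seed=0):
    """Assumed 'bootstrap sampling': draw n training indices with replacement.

    Indices that are never drawn form the held-out test set.
    """
    rng = random.Random(seed)
    train = [rng.randrange(n) for _ in range(n)]
    drawn = set(train)
    test = [i for i in range(n) if i not in drawn]
    return train, test


if __name__ == "__main__":
    # Toy labels only; these are not results from the paper.
    truth = ["pos", "neg", "pos", "neg"]
    guess = ["pos", "pos", "pos", "neg"]
    print("misclassification rate:", misclassification_rate(truth, guess))
    print("linear split:", linear_split(10, 7))
    print("random split:", random_split(10, 7))
    print("bootstrap split:", bootstrap_split(10))
```

Each split function returns index lists, so any classifier can be trained on the first list and scored on the second with `misclassification_rate`. Repeating this per classifier and per sampling method would give a comparison grid like the one the paper describes.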
