Abstract: Sentiment analysis has been used to study aspects of software engineering, such as issue resolution, toxicity, and self-admitted technical debt. To address the peculiarities of software engineering texts, sentiment analysis tools often consider the specific technical lingo practitioners use. To further improve the application of sentiment analysis, there have been two recommendations: using pre-trained transformer models to classify sentiment and replacing non-natural language elements with meta-tokens. In this work, we benchmark five different sentiment analysis tools (two pre-trained transformer models and three machine learning tools) on two gold-standard sentiment analysis datasets. We find that pre-trained transformers outperform the best machine learning tool on only one of the two datasets, and that even on that dataset the performance difference is a few percentage points. Therefore, we recommend that software engineering researchers not consider predictive performance alone when selecting a sentiment analysis tool, because the best-performing sentiment analysis tools perform very similarly to each other (within 4 percentage points). Meanwhile, we find that meta-tokenization does not improve the predictive performance of sentiment analysis tools. Both of our findings can be used by software engineering researchers who seek to apply sentiment analysis tools to software engineering data.

What problem does this paper attempt to address?

The paper explores how to apply sentiment analysis techniques more effectively in software engineering and investigates two research questions:

1. **Research Question 1 (RQ1)**: Are existing deep learning-based sentiment analysis models superior to machine learning-based sentiment analysis tools?
   - The paper answers this by comparing the performance of five sentiment analysis tools (two pre-trained Transformer models and three machine learning tools) on two gold-standard datasets.
   - On one dataset, the pre-trained Transformer models slightly outperform the best machine learning tool; on the other dataset, the opposite is true. Overall, the performance differences among the best-performing tools are small, usually not exceeding 4 percentage points.
2. **Research Question 2 (RQ2)**: Does replacing non-natural language elements with meta-tokens improve the predictive performance of sentiment analysis tools?
   - To investigate this question, the authors applied meta-tokenization to the non-natural language elements in the two datasets.
   - The results indicate that meta-tokenization did not significantly improve the performance of any of the tested sentiment analysis tools.

In summary, the main contributions of this paper are:

- It compares deep learning models with traditional machine learning models on sentiment analysis of software engineering texts and finds that the performance differences between them are minimal.
- It evaluates meta-tokenization as a preprocessing method and concludes that it does not significantly improve the performance of sentiment analysis tools.

These findings offer guidance to software engineering researchers on how to select and apply sentiment analysis tools to software engineering texts.
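
To make the preprocessing step in RQ2 concrete, the following is a minimal sketch of what meta-tokenization could look like: non-natural language elements in a developer comment are replaced with placeholder tokens before the text reaches a sentiment classifier. The regular expressions, the meta-token names (such as `[URL]` and `[CODE_BLOCK]`), and the function name `meta_tokenize` are all assumptions made for illustration; the summary above does not state which elements or tokens the paper actually uses.

```python
import re

# Hypothetical rules: each pairs a pattern for one kind of non-natural
# language element with the meta-token that replaces it. Order matters:
# fenced code blocks are replaced before inline code so that the fence
# markers are not mistaken for inline code delimiters.
META_TOKEN_RULES = [
    (re.compile(r"```.*?```", re.DOTALL), "[CODE_BLOCK]"),  # fenced code
    (re.compile(r"`[^`\n]+`"), "[INLINE_CODE]"),            # inline code
    (re.compile(r"https?://\S+"), "[URL]"),                 # web links
    (re.compile(r"@\w+"), "[MENTION]"),                     # user mentions
    (re.compile(r"#\d+"), "[ISSUE_REF]"),                   # issue numbers
]


def meta_tokenize(text: str) -> str:
    """Replace non-natural language elements with meta-tokens."""
    for pattern, token in META_TOKEN_RULES:
        text = pattern.sub(token, text)
    return text


if __name__ == "__main__":
    comment = "Thanks @alice, see https://example.com/x and `foo()` in #42"
    print(meta_tokenize(comment))
```

A sketch like this is deliberately crude: the URL pattern, for example, also absorbs trailing punctuation attached to a link. The output of `meta_tokenize` would then be passed to whichever sentiment analysis tool is being evaluated, so that the same tool can be compared with and without the preprocessing step.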

Transformers and meta-tokenization in sentiment analysis for software engineering

Evaluation of Sentiment Analysis in Finance: From Lexicons to Transformers

Incorporating Pre-trained Transformer Models into TextCNN for Sentiment Analysis on Software Engineering Texts

Text Sentiment Analysis Based on Transformer and Augmentation

Improving Sentiment Analysis over non-English Tweets using Multilingual Transformers and Automatic Translation for Data-Augmentation

Emotion Classification In Software Engineering Texts: A Comparative Analysis of Pre-trained Transformers Language Models

Sentiment Analysis Across Languages: Evaluation Before and After Machine Translation to English

SEntiMoji: An Emoji-Powered Learning Approach for Sentiment Analysis in Software Engineering

A Hybrid Transformer and Attention Based Recurrent Neural Network for Robust and Interpretable Sentiment Analysis of Tweets

Analysis of the Evolution of Advanced Transformer-Based Language Models: Experiments on Opinion Mining

BERT-Based Sentiment Analysis: A Software Engineering Perspective

Sentiment and semantic analysis: Urban quality inference using machine learning algorithms

Text‐based sentiment analysis in finance: Synthesising the existing literature and exploring future directions

A Benchmark Study on Sentiment Analysis for Software Engineering Research

Transformer-Based Video Comment Analysis

Transformer and Multi-scale Convolution for Target-Oriented Sentiment Analysis

You Shall Know a Tool by the Traces it Leaves: The Predictability of Sentiment Analysis Tools

Optimizing Transformer based on high-performance optimizer for predicting employment sentiment in American social media content

Transformer-based approaches to Sentiment Detection

Contextual Sentence Analysis for the Sentiment Prediction on Financial Data

Dawn of the transformer era in speech emotion recognition: closing the valence gap