Abstract: In this paper, we investigate the use of decoder-based generative transformers for extracting sentiment towards named entities in Russian news articles. We study the sentiment analysis capabilities of instruction-tuned large language models (LLMs) on the RuSentNE-2023 dataset. The first group of experiments evaluates the zero-shot capabilities of LLMs of closed and open transparency. The second covers fine-tuning Flan-T5 using the "chain-of-thought" (CoT) three-hop reasoning framework (THoR). We found that the results of the zero-shot approaches are similar to those achieved by baseline fine-tuned encoder-based transformers (BERT-base). Fine-tuning Flan-T5 with THoR yields an improvement of at least 5% with the base-size model over the results of the zero-shot experiment. The best sentiment analysis results on RuSentNE-2023 were achieved by fine-tuned Flan-T5-xl, which surpassed previous state-of-the-art transformer-based classifiers. Our CoT application framework is publicly available:

What problem does this paper attempt to address?

The paper attempts to address the problem of Targeted Sentiment Analysis (TSA) in Russian news texts. Specifically, the researchers explored the following points:

1. **Sentiment Analysis using Large Language Models (LLMs)**:
   - Investigated the ability of decoder-based generative transformers to extract sentiment towards named entities in Russian news articles.
   - Evaluated the sentiment analysis capabilities of instruction-tuned large language models (LLMs) in zero-shot and few-shot settings.
2. **Dataset and Experimental Design**:
   - Conducted experiments using the RuSentNE-2023 dataset, which contains Russian texts annotated with sentiment.
   - The experiments were divided into two parts: the first part assessed the zero-shot capabilities of LLMs with different transparency levels ("closed models" and "open models"); the second part involved fine-tuning experiments using the Flan-T5 model combined with the Three-Hop Reasoning (THoR) framework.
3. **Model Comparison and Performance Improvement**:
   - Compared the performance of various LLMs (such as GPT-4, GPT-3.5, Mistral, DeciLM, etc.) in zero-shot settings and found that these models performed worse on Russian texts than on the same texts translated into English.
   - The fine-tuned Flan-T5 model demonstrated excellent performance under the THoR framework, surpassing the previous state-of-the-art encoder-based classifiers.
4. **Error Analysis**:
   - Analyzed the main types of discrepancies between model predictions and human annotations, including misjudgment of compound sentiment sentences (E1), incorrect sentiment direction towards multiple entities (E2), and sentiment recognition errors for single entities (E3).

In summary, this paper aims to improve the accuracy and robustness of sentiment analysis tasks by enhancing the application of large language models in Russian targeted sentiment analysis.
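
To make the three-hop idea more concrete, the following is a minimal sketch of how a THoR-style prompt chain could be assembled: the first hop asks about the aspect, the second about the implied opinion, and the third about the polarity, with each hop reusing the previous prompt and answer. The prompt wording, the function names `three_hop_sentiment` and `parse_label`, and the `generate` callable are assumptions made for illustration; they are not the templates or code used in the paper.

```python
from typing import Callable, Dict, List

# Label set assumed for this sketch.
LABELS = ("positive", "negative", "neutral")


def parse_label(text: str) -> str:
    """Map a free-form model answer to one label; fall back to 'neutral'."""
    lowered = text.lower()
    for label in LABELS:
        if label in lowered:
            return label
    return "neutral"


def three_hop_sentiment(
    sentence: str,
    entity: str,
    generate: Callable[[str], str],
) -> Dict[str, object]:
    """Run a three-step prompt chain: aspect -> opinion -> polarity.

    `generate` is a hypothetical stand-in for any text-generation call
    (a fine-tuned Flan-T5, an API client, etc.) that maps a prompt to text.
    """
    prompts: List[str] = []

    # Hop 1: ask which aspect of the entity the sentence talks about.
    p1 = (
        f'Given the sentence "{sentence}", '
        f"which specific aspect of {entity} is possibly mentioned?"
    )
    prompts.append(p1)
    aspect = generate(p1)

    # Hop 2: carry the aspect forward and ask for the underlying opinion.
    p2 = (
        f"{p1} {aspect} Based on common sense, "
        f"what is the implicit opinion towards {entity}, and why?"
    )
    prompts.append(p2)
    opinion = generate(p2)

    # Hop 3: carry the opinion forward and ask for the final polarity.
    p3 = (
        f"{p2} {opinion} Based on the opinion, what is the sentiment "
        f"polarity towards {entity}? Answer with one of: "
        f"{', '.join(LABELS)}."
    )
    prompts.append(p3)
    answer = generate(p3)

    return {"prompts": prompts, "label": parse_label(answer)}
```

Passing the model as a plain callable keeps the sketch independent of any particular library, so the same chain could be tried against a zero-shot model or a fine-tuned one.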

Large Language Models in Targeted Sentiment Analysis
