Abstract: Social media services and analytics platforms are growing rapidly. Numerous events happen almost every day, and the role of social media monitoring tools is increasing accordingly. Social networks are widely used for managing and promoting brands and services. As a result, the most popular social analytics platforms target business purposes, while the monitoring of social, economic, and political problems remains underrepresented and is not covered by thorough research. Moreover, most of these platforms focus on resource-rich languages such as English, whereas social media texts and comments in low-resource languages, such as Russian and Kazakh, are not represented well enough. This work is therefore devoted to developing and applying an information system called the OMSystem for analyzing users' opinions on news portals, blogs, and social networks in Kazakhstan. The system uses Russian and Kazakh sentiment dictionaries and machine learning algorithms to determine the sentiment of social media texts. The overall structure and functionality of the system are also presented. The experimental part is devoted to building machine learning models for sentiment analysis on Russian and Kazakh datasets. The performance of the models is then evaluated with accuracy, precision, recall, and F1-score metrics, and the models with the highest scores are selected for implementation in the OMSystem. The OMSystem's social analytics module is then used to analyze the healthcare, political, and social aspects of the most relevant topics connected with vaccination against the coronavirus disease. The analysis allowed us to discover the public social mood in the cities of Almaty and Nur-Sultan and other large regional cities of Kazakhstan. The study covered two extensive periods: 10-01-2021 to 30-05-2021 and 01-07-2021 to 12-08-2021.
In the obtained results, people's moods and attitudes toward the Government's policies and actions were studied using social network indicators such as the level of topic discussion activity in society, the level of interest in the topic in society, and the mood level of society. These indicators, calculated by the OMSystem, allowed careful identification of alarming factors among the public (negative attitude toward government regulations, vaccination policies, trust in vaccination, etc.) and assessment of the social mood.
