Abstract: Sentiment analysis is a computational text mining technique, based on natural language processing (NLP), for extracting opinions expressed on online platforms, including Twitter, one of the most popular microblogging platforms in Indonesia. Two approaches are commonly used in sentiment analysis: the machine learning (ML) based approach and the sentiment lexicon (SL) based approach. This research focuses on developing the machine learning-based approach, also called the supervised approach, on Twitter datasets. Most sentiment analysis on Indonesian-language Twitter datasets relies on a single machine learning algorithm. This study instead combines the performance of several algorithms (experts) while reducing the misclassification rate by dynamically updating their weights using a weighted majority vote (WMV) based on the joint distribution of a Bayesian Network. In the first stage, data were collected from Twitter using 3 hashtags related to Covid-19 as experimental data. The performance of the weighted majority vote was then compared extensively against 4 baseline methods: Naïve Bayes, Gaussian Naïve Bayes, Multinomial Naïve Bayes, and a Majority Vote of these three single classifiers. The performance metrics used are precision, recall, F-measure, accuracy, and the Matthews correlation coefficient (MCC). In the experiments, WMV was shown to improve sentiment analysis performance on all three dataset topics across these performance metrics.
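The idea of combining several experts through dynamically updated weights can be illustrated with a minimal sketch. The abstract does not state the update rule, so the code below assumes the classic multiplicative weighted-majority update (a wrong expert's weight is multiplied by a factor `beta`) as a stand-in; it is not the Bayesian Network joint-distribution weighting used in the study. The function names, the `"pos"`/`"neg"` labels, the value of `beta`, and the toy predictions are all hypothetical. A small Matthews correlation coefficient helper is included because it is one of the metrics listed above.

```python
from collections import defaultdict
from math import sqrt


def weighted_vote(predictions, weights):
    """Return the label with the largest total weight among the experts' votes."""
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    # Iterating over sorted labels makes tie-breaking deterministic.
    return max(sorted(totals), key=lambda label: totals[label])


def update_weights(predictions, weights, truth, beta=0.5):
    """Assumed update rule: multiply the weight of each wrong expert by beta."""
    return [w * beta if p != truth else w for p, w in zip(predictions, weights)]


def run_wmv(expert_outputs, truths, beta=0.5):
    """Vote on each tweet in order, then update weights once its label is known."""
    weights = [1.0] * len(expert_outputs[0])
    combined = []
    for predictions, truth in zip(expert_outputs, truths):
        combined.append(weighted_vote(predictions, weights))
        weights = update_weights(predictions, weights, truth, beta)
    return combined, weights


def mcc(truths, predicted, positive="pos"):
    """Matthews correlation coefficient for a binary labelling."""
    pairs = list(zip(truths, predicted))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical toy data: one row per tweet, one column per expert.
expert_outputs = [
    ("pos", "neg", "neg"),
    ("pos", "neg", "neg"),
    ("pos", "neg", "neg"),
    ("neg", "pos", "pos"),
]
truths = ["pos", "pos", "pos", "neg"]

combined, final_weights = run_wmv(expert_outputs, truths)
score = mcc(truths, combined)
```

In this toy run the first expert is always right and the other two are always wrong, so an unweighted majority vote would misclassify every tweet, while the weighted vote starts following the reliable expert once the other two weights have decayed.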

Implementation of The Indonesian Language Stemming Algorithm in Twitter Data Preprocessing. Case Study: Twitter Wargabanua and Instakalsel

Stemmer and phonotactic rules to improve n-gram tagger-based Indonesian phonemicization

Analysis of language identification algorithms for regional Indonesian languages

Hybrid Models for Emotion Classification and Sentiment Analysis in Indonesian Language

Joint Distribution pada Weighted Majority Vote (WMV) untuk Peningkatan Kinerja Sentiment Analysis Tersupervisi pada Dataset Twitter

Sentiment Analysis Using Naive Bayes Algorithm Of The Data Crawler: Twitter

Location-based Twitter Filtering for the Creation of Low-Resource Language Datasets in Indonesian Local Languages

Bidirectional Long Short Term Memory Method and Word2vec Extraction Approach for Hate Speech Detection

Alih Kode dan Campur Kode Pada Akun Twitter @Marnombois

Towards Curbing Cyber-Bullying in Malaysia by Author Identification of Iban and Kadazandusun OSN Text Using Deep Learning

Identify User Behavior based on Tweet Type on Twitter Platform using Mean Shift Clustering

A comparison of text weighting schemes on sentiment analysis of government policies: a case study of replacement of national examinations

Exploration of Opinion Movers Interaction Patterns in the Social Network Analysis Method on Twitter (Case Study: keyword kabinet indonesia maju)

An improved Urdu stemming algorithm for text mining based on multi-step hybrid approach

Strategy For Increasing the Brand Reputation of The PLN Mobile Application Based on Social Media Sentiment Analysis Using Machine Learning

Sejarah dan Perkembangan Teknik Natural Language Processing (NLP) Bahasa Indonesia: Tinjauan tentang sejarah, perkembangan teknologi, dan aplikasi NLP dalam bahasa Indonesia

Deteksi Depresi dan Kecemasan Pengguna Twitter Menggunakan Bidirectional LSTM

Overview of Stemming Algorithms for Indian and Non-Indian Languages

An efficient preprocessing method for supervised sentiment analysis by converting sentences to numerical vectors: a twitter case study

Architecture of Text Mining Application in Analyzing Public Sentiments of West Java Governor Election using Naive Bayes Classification

Implementasi dan Analisis Model Machine Learning Decision Tree untuk Deteksi Akun Palsu di Twitter