Feature Extraction for Improvement Text Classification of Spam YouTube Video Comment using Deep Learning

Jasmir Jasmir, Willy Riyadi, Pareza Alam Jusia
DOI: https://doi.org/10.29207/resti.v7i6.5249
2023-12-26
Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi)
Abstract: The proposed algorithms are Bidirectional Long Short-Term Memory (BiLSTM) and Conditional Random Fields (CRF) combined with a Data Augmentation Technique (DAT). DAT integrates spam YouTube video comments into the traditional TF-IDF algorithm and generates a weighted word vector. The weighted word vector is fed into the BiLSTM-CRF to capture context information effectively. The result of this study is a new classification model for spam YouTube video comments, intended to improve classification performance. This research conducted two experiments: the first using BiLSTM-CRF without DAT and the second using BiLSTM-CRF with DAT. The experimental results show that BiLSTM-CRF with DAT achieves strong performance in text classification, particularly on spam YouTube video comments, with accuracy = 83.3%, precision = 83.6%, recall = 83.3%, and F-measure = 83.3%. The combination of BiLSTM-CRF and the Data Augmentation Technique is therefore effective and can be used to increase the accuracy of text classification for spam YouTube video comments.
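The TF-IDF weighted word vector mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the TF-IDF variant, the embeddings, or how DAT modifies the weighting, so everything below is an assumption: the unsmoothed formula tf × log(N / df), the toy comments, the hypothetical 2-dimensional embeddings, and the helper names `tfidf_weights` and `weighted_sequence`. The resulting sequence of weighted vectors is the kind of input a BiLSTM-CRF layer could consume; the network itself is not shown.

```python
import math
from collections import Counter


def tfidf_weights(docs):
    """Return one {token: tf-idf} dict per tokenized document.

    Assumed variant: tf = count / document length, idf = log(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each token.
    df = Counter(tok for doc in docs for tok in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({
            tok: (count / len(doc)) * math.log(n_docs / df[tok])
            for tok, count in tf.items()
        })
    return weights


def weighted_sequence(doc, weights, embeddings):
    """Scale each token's embedding by its TF-IDF weight, keeping token order."""
    return [[weights[tok] * x for x in embeddings[tok]] for tok in doc]


# Toy comments, invented for illustration only.
comments = [
    "check out my channel".split(),
    "great song love it".split(),
    "subscribe to my channel".split(),
]

# Hypothetical 2-dimensional embeddings; a real model would use learned vectors.
vocab = sorted({tok for doc in comments for tok in doc})
embeddings = {tok: [1.0, float(i)] for i, tok in enumerate(vocab)}

weights = tfidf_weights(comments)
sequence = weighted_sequence(comments[0], weights[0], embeddings)
```

In this sketch, tokens that appear in fewer comments (such as "check") receive larger weights than tokens shared across comments (such as "my"), so their vectors contribute more strongly to the sequence passed downstream.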