Multilingual hate speech detection sentimental analysis on social media platforms using optimal feature extraction and hybrid diagonal gated recurrent neural network
Purbani Kar, Swapan Debbarma
DOI: https://doi.org/10.1007/s11227-023-05361-6
IF: 3.3
2023-05-30
The Journal of Supercomputing
Abstract: Many activities are conducted on social media platforms, such as promoting products, sharing news and sharing achievements. Because users enjoy freedom and anonymity on these platforms, hate speech and harassment are common. Posts that spread hate and offense should be detected and deleted as soon as possible, as they spread very quickly and have many negative consequences. The detection task becomes significantly more difficult when online users write code-mixed text in places where English is not the primary language. Hate speech detection has recently been reduced to a binary classification task, without taking into account its topical focus and its target-oriented nature. Because there is no combined annotated dataset or scientific study that provides insight into the relationship between offense traits, existing techniques usually examine only one or two offense traits at a time. Furthermore, these techniques are not efficient in multilingual settings, where most conversations are code-mixed. In this paper, we propose an optimal feature extraction and hybrid diagonal gated recurrent neural network (FE-DGRNN) for hate speech detection and sentiment analysis in multilingual code-mixed texts. The proposed FE-DGRNN technique consists of three processes. After preprocessing, we first introduce an improved seagull optimization (ISO) algorithm to extract multiple features from the given code-mixed texts. Then, we use a quantum search optimization algorithm to optimize the extracted features, which reduces data dimensionality in the subsequent detection phase. Next, a hybrid diagonal gated recurrent neural network (Hyb-DGRNN) is introduced to detect hate speech and analyze the sentiment of the text. To validate the effectiveness of the proposed FE-DGRNN technique, we conducted experiments on the HASOC 2019 dataset. This dataset includes posts written in English, Hindi and German, allowing us to evaluate the performance of our approach across multiple languages. From the simulation results, we observed that the accuracy of FE-DGRNN is 87.74%, 88.98% and 84.74% for Task-1, Task-2 and Task-3, respectively, on the multilingual code-mixed text dataset. Overall, the proposed FE-DGRNN technique shows a significant improvement in accuracy, precision, recall and F-measure compared to other classifiers, indicating its potential to be a robust and effective tool for hate speech detection and sentiment analysis in multilingual code-mixed texts.
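The abstract reports accuracy, precision, recall and F-measure. As a reading aid, the following is a minimal sketch of how these four metrics are conventionally computed for a binary task such as hate versus non-hate. It is not taken from the paper: the function name `binary_metrics`, the label encoding (1 for the positive class) and the sample labels are assumptions chosen for illustration, and the paper's own averaging scheme for its multi-class tasks is not stated in the abstract.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int],
                   positive: int = 1) -> Dict[str, float]:
    """Compute accuracy, precision, recall and F-measure for one positive class.

    Hypothetical helper for illustration; labels are assumed to be integers,
    with `positive` marking the class of interest (e.g. hateful posts).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four confusion-matrix cells.
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive and truth != positive:
            fp += 1
        elif pred != positive and truth == positive:
            fn += 1
        else:
            tn += 1

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-measure here is the harmonic mean of precision and recall (F1).
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


if __name__ == "__main__":
    # Made-up labels, not data from HASOC 2019.
    truth = [1, 1, 0, 0, 1]
    preds = [1, 0, 0, 1, 1]
    print(binary_metrics(truth, preds))
```

Undefined ratios (for example, precision when nothing is predicted positive) are returned as 0.0 here. That is one common convention and not necessarily the one used by the authors.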
computer science, theory & methods; engineering, electrical & electronic; hardware & architecture